
unsigned vs signed - Is Bjarne Mistaken?

845 views

Le Chaud Lapin

26 Apr 2003, 09:13:41
In Section 4.4 on page 73 of "The C++ Programming Language",
Stroustrup writes

"The unsigned integer types are ideal for uses that treat storage as a
bit array. Using an unsigned instead of an int to gain one more bit
to represent positive integers is almost never a good idea. Attempts
to ensure that some values are positive by declaring variables
unsigned will typically be defeated by the implicit conversion rules."

In Section 4.10, advice item [18] is:

"Avoid unsigned arithmetic"

I disagree with these two statements. I think that the opposite should
have been stated, that one should avoid *signed* arithmetic, unless
the context of whatever is being modeled implicitly prescribes signed.
To say that it irks me that some people actually blindly follow this
advice would be an understatement.

But before I rant, and I do intend to rant, I would like to hear the
opinions of others.

Who agrees with Bjarne and who disagrees?

Regards,

-Chaud Lapin-

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]
[ about comp.lang.c++.moderated. First time posters: do this! ]

Ivan Vecerina

27 Apr 2003, 06:53:34
"Le Chaud Lapin" <unorigina...@yahoo.com> wrote in message
news:fc2e0ade.03042...@posting.google.com...
: In Section 4.4 on page 73 of "The C++ Programming Language",
: Stroustrup writes
....
: "Avoid unsigned arithmetic"
:
: I disagree with these two statements.
.....
: Who agrees with Bjarne and who disagrees?

Scott Meyers, among other experts, agrees. See for example
http://www.aristeia.com/Papers/C++ReportColumns/sep95.pdf

It is confusing, however, that the C and C++ standards
like to rely on unsigned types in several places.


My own personal taste, admittedly, would be to prefer
unsigned types. Not to win a bit of storage, but
to let the code be more explicit/descriptive.


The fact is that using unsigned instead of signed types
can create bugs -- as you lose information.
One typical example is:
for( unsigned i = v.size() ; --i >= 0 ; )
process( v[i] );

This might be enough to recommend the use of signed types.
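
A quick aside not in the original post: --i >= 0 is always true for an
unsigned i, so the loop above never terminates and eventually indexes out
of range. A minimal sketch of a rewrite that stays unsigned:

    #include <vector>

    void process(int);   // hypothetical element handler

    void process_all_in_reverse(const std::vector<int>& v)
    {
        // Test before decrementing, so i never wraps below zero;
        // this also handles an empty vector correctly.
        for (std::vector<int>::size_type i = v.size(); i > 0; )
        {
            --i;
            process(v[i]);
        }
    }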

The question is:
What do you win by using unsigned types ?

--
Ivan Vecerina, Dr. med. <> http://www.post1.com/~ivec
Soft Dev Manager, XiTact <> http://www.xitact.com
Brainbench MVP for C++ <> http://www.brainbench.com

Jack Klein

27 Apr 2003, 06:59:07
On 26 Apr 2003 09:13:41 -0400, unorigina...@yahoo.com (Le Chaud
Lapin) wrote in comp.lang.c++.moderated:

> In Section 4.4 on page 73 of "The C++ Programming Language",
> Stroustrup writes
>
> "The unsigned integer types are ideal for uses that treat storage as a
> bit array. Using an unsigned instead of an int to gain one more bit
> to represent positive integers is almost never a good idea. Attempts
> to ensure that some values are positive by declaring variables
> unsigned will typically be defeated by the implicit conversion rules."
>
> In Section 4.10, advice item [18] is:
>
> "Avoid unsigned arithmetic"
>
> I disagree with these two statements. I think that the opposite should
> have been stated, that one should avoid *signed* arithmetic, unless
> the context of whatever is being modeled implicitly prescribes signed.
> To say that it irks me that some people actually blindly follow this
> advice would be an understatement.
>
> But before I rant, and I do intend to rant, I would like to hear the
> opinions of others.
>
> Who agrees with Bjarne and who disagrees?
>
> Regards,
>
> -Chaud Lapin-

I both agree and disagree with Bjarne on this subject.

Unsigned types are the best choice for bit fiddling, and are often a
good choice for values that can never be negative, size_t being one
example, since objects can't have negative sizes.

On the other hand, potential for errors crops up when mixing signed
and unsigned types, especially of different widths, in expressions.

Consider:

unsigned short us = 3;
int i = -1;

if (i < us)
{
    /* stuff */
}

Will the body be executed? Completely implementation-defined.
Let's look at the "usual arithmetic conversions" rules:

When a "narrower" unsigned type must be converted to a "wider" type,
there are two possible outcomes:

1. If the signed variant of the "wider" type can hold all values of
the "narrower" unsigned type, the promotion is to the signed wider
type.

2. If the signed variant of the wider type can't hold all values of
the narrower unsigned type, the promotion is to the unsigned wider
type.

Note that the conversion depends on the range of values for the types,
not the actual current value of the narrower type being promoted.

So on a compiler where short and int share the same representation (16
bit architectures, many digital signal processors), us is promoted to
an unsigned int with the value of 3. This requires that i also be
converted to unsigned int, and converting -1 to unsigned int results
in the value UINT_MAX. Since UINT_MAX must be greater than 3, the
body is not executed.

On a typical 32-bit desktop architecture today, short contains 16
bits and int contains 32 bits. Signed int can hold all possible
values of an unsigned short, so us is converted to a signed int with
the value of 3 and i is not converted at all. Now i < us, so the
body executes.

One could dismiss this on the basis that architectures where short and
int have the same representation are only a problem in niches like DSP
applications and other embedded systems, but the same problem shows up
on typical desktops with other data types:

unsigned int ui = 3;
long l = -1;

if (l < ui)
{
    /* stuff */
}

In this case, the body executes on a 16-bit implementation where int
has 16 bits and long has 32. It does not execute on typical 32-bit
platforms where int and long have the same representation, because both
ui and l convert to unsigned long.
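
A compilable sketch of both comparisons (assuming, as above, a platform
where short is 16 bits and int and long are both 32 bits; this snippet is
an illustration, not part of the original post):

    #include <iostream>

    int main()
    {
        unsigned short us = 3;
        int i = -1;
        // us promotes to the wider signed int, so this is a signed
        // comparison: -1 < 3 holds and 1 is printed.
        std::cout << (i < us) << '\n';

        unsigned int ui = 3;
        long l = -1;
        // int and long have the same width here, so both operands
        // convert to unsigned long; -1 becomes ULONG_MAX and the
        // comparison is false, so 0 is printed.
        std::cout << (l < ui) << '\n';
    }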

Also consider a typical newbie implementation of a comparison
function, such as for passing to qsort:

return val1 - val2;

If the intent is to return a positive value when val1 > val2, 0 when
they are equal, and a negative value when val1 < val2, it fails in the
third case.
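
One way to avoid the subtraction trap is to compare instead of subtracting;
a small sketch (the unsigned parameter types are an assumption for
illustration, not from the original post):

    // The sign of the result is always what the caller expects,
    // because nothing here can wrap.
    int compare_values(unsigned val1, unsigned val2)
    {
        if (val1 < val2) return -1;
        if (val1 > val2) return 1;
        return 0;
    }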

Care must be taken when using unsigned types in arithmetic
expressions, and special care when the expression mixes signed and
unsigned types.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq

Terje Slettebø

27 Apr 2003, 07:00:05
unorigina...@yahoo.com (Le Chaud Lapin) wrote in message news:<fc2e0ade.03042...@posting.google.com>...

> In Section 4.4 on page 73 of "The C++ Programming Language",
> Stroustrup writes
>
> "The unsigned integer types are ideal for uses that treat storage as a
> bit array. Using an unsigned instead of an int to gain one more bit
> to represent positive integers is almost never a good idea. Attempts
> to ensure that some values are positive by declaring variables
> unsigned will typically be defeated by the implicit conversion rules."
>
> In Section 4.10, advice item [18] is:
>
> "Avoid unsigned arithmetic"
>
> I disagree with these two statements. I think that the opposite should
> have been stated, that one should avoid *signed* arithmetic, unless
> the context of whatever is being modeled implicitly prescribes signed.
> To say that it irks me that some people actually blindly follow this
> advice would be an understatement.
>
> But before I rant, and I do intend to rant, I would like to hear the
> opinions of others.
>
> Who agrees with Bjarne and who disagrees?

There was a very long thread about this one and a half year ago
(http://groups.google.com/groups?selm=3BFEEAE8.196852BD%40bawi.org).
You may want to check it out.

It seems to me that the consensus more or less was with Bjarne
Stroustrup. The reason is especially the implicit conversion rules
that he mentions.

At the time, I argued that if a function's parameter can't be
negative, it may add semantics to the code to enforce that, by making
the parameter unsigned.

However, if you pass a signed value to it, it will be converted to
unsigned, so that a small negative value may result in a large
positive value. Therefore, it doesn't really enforce that only
positive values are passed to it.


Regards,

Terje

JKB

27 Apr 2003, 07:01:20
Regarding Stroustrup's:

>"The unsigned integer types are ideal for uses that treat storage as a
>bit array. Using an unsigned instead of an int to gain one more bit
>to represent positive integers is almost never a good idea. Attempts
>to ensure that some values are positive by declaring variables
>unsigned will typically be defeated by the implicit conversion rules."
> ... "Avoid unsigned arithmetic"

unorigina...@yahoo.com (Le Chaud Lapin) wrote:
>I disagree with these two statements. I think that the opposite should
>have been stated, that one should avoid *signed* arithmetic, unless
>the context of whatever is being modeled implicitly prescribes signed.

If the context requires one or the other, then of course you use
whatever is needed. And it seems pretty clear that unsigned is best
for use as a bit array. So the only interesting question is what type
to use when the exact type does not matter.

> To say that it irks me that some people actually blindly follow this
>advice would be an understatement.

But you are committing the same error, by asserting without evidence
that one should avoid signed arithmetic. Anyone accepting your advice
is doing so blindly and hence you should be irked by that too.

>But before I rant, and I do intend to rant, I would like to hear the
>opinions of others. Who agrees with Bjarne and who disagrees?

Except that this is not a voting process. A variety of people post to
this group, from Wise Sages to Complete Nitwits. Counting the votes
won't get you anything useful.


My own opinion is that one should use unsigned for bit arrays or for
the specific cases when you need the extra integer range. Use signed
integers by default for situations where there might be arithmetic.
Use typedefs to represent specific integer sizes, using 'int' only
when you don't care about its size. And 'char' should be unsigned.

But those are opinions. Specific reasons would always override them.
-- jkb

Carlos Moreno

27 Apr 2003, 07:01:42
Le Chaud Lapin wrote:
>
> "The unsigned integer types are ideal for uses that treat storage as a
> bit array. Using an unsigned instead of an int to gain one more bit
> to represent positive integers is almost never a good idea. Attempts
> to ensure that some values are positive by declaring variables
> unsigned will typically be defeated by the implicit conversion rules."

I have to almost entirely agree with this. The last statement is
particularly convincing:

bool f (unsigned int n)

Putting in the unsigned int to have the *guarantee* that you don't
receive negative values leads to ugly surprises when someone
*actually passes* a negative value (which you cannot prevent).

The call f(-1) will be treated by you as f(4294967295) (on a
32-bit platform, that is).

If you must ensure that no negative numbers are passed, you're
better off receiving an int, and checking that the parameter is
>= 0.
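
A minimal sketch of that int-plus-check alternative (the body is
hypothetical, not from the original post):

    bool f(int n)
    {
        if (n < 0)
            return false;   // reject negative input explicitly
                            // (or assert/throw, depending on the contract)
        // ... n is now known to be non-negative ...
        return true;
    }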

Now, if you use the unsigned in the parameter to *tell* client
code that you're expecting a non-negative number, or to treat it
as a bitmask (as Stroustrup mentions), then it's fine.

As for the "avoid unsigned arithmetic" or "avoid signed
arithmetic", I'm not sure I agree with either one. Most
definitely I would NOT avoid signed arithmetic. Why would I
avoid it?

I'm not sure about the reasons for Stroustrup recommending
to avoid unsigned arithmetic, but if I must choose one of
these two statements to support, I would definitely go for
"avoid unsigned arithmetic" (implicit conversions from a
negative int that accidentally goes in the arithmetic
operation causes your program to fail). Signed arithmetic
allows you to deal with positive numbers if you want.
Unsigned arithmetic doesn't allow you to deal with negative
numbers -- and as soon as you put a negative number in (say,
by accident), you completely screw up the whole thing.

In other words, code that deals exclusively with ints
and receives an unsigned int whose value is representable
by an int (which should be the most likely scenario), will
work fine. Code that deals with unsigned ints and receives
an int with a negative value (which a priori has 50%-50%
odds, after all), will fail miserably.

> But before I rant, and I do intend to rant, I would like to hear the
> opinions of others.

Those are my 2 cents. I'm sure that there are gazillions
of arguments either way (i.e., tons of arguments in favor
of one, tons of arguments in favor of the other, and tons
of arguments in favor of "neither statement is good")

Cheers,

Carlos
--

Bastian Pflieger

27 Apr 2003, 07:57:58
Hi

One advantage of using plain ints / longs is that the conversion rules
are more predictable and reduced to a few.

So, in _general_ I agree with him (aka Bjarne :).
I think most people use common sense to determine which type is needed,
instead of blindly following the advice.


bastian

Dave Harris

27 Apr 2003, 07:59:13
unorigina...@yahoo.com (Le Chaud Lapin) wrote (abridged):

> Who agrees with Bjarne and who disagrees?

I agree with Bjarne. Mainly because I find signed types ubiquitous. For
example, 0 is a signed int literal. So is 42. And the difference between
two numbers may often be negative. And I often find it convenient to use
negative numbers to represent errors or not-found or other information
in-band.

Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
bran...@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."

Andrea Griffini

27 Apr 2003, 08:00:26
On 26 Apr 2003 09:13:41 -0400, unorigina...@yahoo.com (Le Chaud
Lapin) wrote:

>Who agrees with Bjarne and who disagrees?

I think that unsigned is nice because of the modulo property,
but unless that property is the main idea I would suggest to
stay away from it. I agree that just one bit more doesn't
change your life, so if 31 bits aren't enough then 32 soon
won't be enough either ... or this is at least the common case.

Actually I think that using an unsigned as the value of size()
was everything but smart in the standard library; it makes
things worse for no reason. On the practical side it has been
in my experience also a source of bugs in things like
"if (i<v.size()-1)", which is different from "if (i+1<v.size())".

If you have a value that logically should be always non
negative then IMO you are *not* doing a favor to yourself by
allocating it in an unsigned variable. The reason is that
an unsigned variable can't physically be negative, and
you're throwing away a possibility of error detection.
If you say

size -= 3;

and the previous value was 2 then I think it's better
having size == -1 than having it containing a nonsense
but positive big number; at least with -1 you *know*
that is nonsense as a size and so you can take proper
actions about it. Using unsigned for values in this
context just *hides* errors.

Andrea

Thomas Richter

27 Apr 2003, 08:00:44
Hi,

> In Section 4.10, advice item [18] is:

> "Avoid unsigned arithmetic"

> I disagree with these two statements.

> But before I rant, and I do intend to rant, I would like to hear the
> opinions of others.

Well, I think basically Bjarne is correct here. You often face the situation
where you have to consider differences of two values. If you consider
unsigned arithmetic, these differences will never be negative, and a certain
surprise factor is likely. For example, for unsigned types

a >= b and a - b >= 0

are not equivalent, whereas for signed types, both are equivalent under the
rather mild assumption that no overflow happens ("types are considered large
enough for the data you'd need to represent). Strictly speaking, the "no overflow"
rule is of course the same for unsigned, except that the overflow condition is
true in half of the useful cases, roughly speaking. (-;
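
For instance, a two-function sketch of that non-equivalence (not from the
original post):

    bool compare_directly(unsigned a, unsigned b)       { return a >= b; }
    bool compare_by_subtraction(unsigned a, unsigned b) { return a - b >= 0; }

    // compare_directly(2, 5) is false, but compare_by_subtraction(2, 5) is
    // true: 2u - 5u wraps to a large unsigned value, and every unsigned
    // value is >= 0 (compilers typically warn that the test is always true).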

A typical example would be a program that needs to identify pixels on a screen:
Even though all useful pixels are likely to be addressed by unsigned values
in the range of the screen resolution, it is better to use signed values here since
it allows you to represent pixel differences as well (e.g. for line drawing and
clipping algorithms) with the same data type, and without having to be overly
careful with automatic casting.

However, I have to confess that I personally prefer "unsigned" for data types
where it looks more natural to me, e.g. counting a number of elements, encoding
a "size" of an object, and for all kinds of "bit juggling". However, one must
be aware that blindly casting these types is likely to cause surprises.

Greetings,
Thomas

Alf P. Steinbach

28 Apr 2003, 08:40:01
On 26 Apr 2003 09:13:41 -0400, unorigina...@yahoo.com (Le Chaud Lapin) wrote:

>In Section 4.4 on page 73 of "The C++ Programming Language",
>Stroustrup writes
>
>"The unsigned integer types are ideal for uses that treat storage as a
>bit array. Using an unsigned instead of an int to gain one more bit
>to represent positive integers is almost never a good idea. Attempts
>to ensure that some values are positive by declaring variables
>unsigned will typically be defeated by the implicit conversion rules."
>
>In Section 4.10, advice item [18] is:
>
>"Avoid unsigned arithmetic"
>

>... I would like to hear the opinions of others.


Many here have stated that because mixed signed/unsigned
arithmetic can easily lead to subtle errors, signed arithmetic should
be used whenever possible.

That is, of course, a fallacy, with almost no connection from premise
to conclusion. But due to the prevalence of examples where the
natural choice would be signed, and the artificial introduction of
unsigned creates some problem, many people believe it. Note, however,
that each such example can easily be transformed to a counter-example.

The opposite conclusion, "because mixed arithmetic can easily lead
to subtle errors, unsigned arithmetic should be used whenever possible",
is just as (in)valid as the first one.

IMO the only reasonable conclusion is to use a compiler that emits
warnings for mixed arithmetic.
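
For example (the flag names below are an assumption about one common
compiler, GCC, and are not from the original post):

    // g++ -Wall (which includes -Wsign-compare for C++) flags the
    // mixed comparison below:
    unsigned n = 10;
    for (int i = 0; i < n; ++i)   // warning: comparison between signed
        ;                         // and unsigned integer expressions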

With that as a premise I tend to use unsigned types whenever an
integer should not be negative. Catering to negative values by
using a signed type would just be putting your head in the sand,
like an ostrich (it's the same kind of reasoning that makes people
avoid using exceptions: if we don't see an error, it doesn't exist).
Unfortunately many do exactly that, using signed arithmetic to avoid
seeing, and what's more, require or admonish others to follow suit.


Hth.,

- Alf


Ostriches that respond to danger by putting their heads in the sand
are, to be fair, a myth, but it's also a commonly understood turn of
phrase. See <url:http://www.sirlinksalot.net/ostrich.html> for more
info about these wonderful flightless birds. Who can _run_ very fast.

Thomas Mang

28 Apr 2003, 08:55:46

Carlos Moreno schrieb:

> Le Chaud Lapin wrote:
> >
> > "The unsigned integer types are ideal for uses that treat storage as a
> > bit array. Using an unsigned instead of an int to gain one more bit
> > to represent positive integers is almost never a good idea. Attempts
> > to ensure that some values are positive by declaring variables
> > unsigned will typically be defeated by the implicit conversion rules."
>
> I have to almost entirely agree with this. The last statement is
> particularly convincing:
>
> bool f (unsigned int n)
>
> You put the unsigned int to have the *guarantee* that you don't
> receive negative values leads to ugly surprises when someone
> *actually passes* a negative value (which you can not prevent).
>
> the call f(-1) will be treated by you as f(4294967295) (on a
> 32-bit platform, that is).

But the Standard Library uses this often:

23.1, container requirements: X::size_type -> unsigned integral type
23.2. container classes, such as vector, have several member functions(e.g.
for vector: resize , reserve, operator[], ...) that take size_type as
argument.


If one follows Stroustrup's advice strictly, then the range of values of
unsigned arithmetic should be limited to the range of positive values of
signed arithmetic. As that's not the case (and the Standard Library clearly
shows us that it is not the case), my advice is to learn the promotion /
conversion rules, stay watchful and decide whether to use unsigned or signed
arithmetic depending on the concrete problem.
For example, I agree with the Standard Library and would use unsigned types
for quantities representing physical sizes, which are never negative -- even
if signed arithmetic would provide the appropriate range of positive values.
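
For instance, indexing a container with its own size_type keeps the whole
loop in the unsigned domain the library already uses (a small sketch, not
from the original post):

    #include <vector>

    void zero_all(std::vector<double>& v)
    {
        for (std::vector<double>::size_type i = 0; i < v.size(); ++i)
            v[i] = 0.0;   // i has the same unsigned type that size() and
                          // operator[] take, so no mixed-sign comparison occurs
    }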

Indeed, I think in practice I end up using unsigned arithmetic much more
frequently than signed arithmetic. But that is not to be taken as advice; I am
sure others end up using signed arithmetic more, because they solve different
problems where different conditions apply.
My advice is to learn the pitfalls of promotions / conversions, and decide
whether to use signed or unsigned arithmetic depending on your specific
problem, and not follow blindly some general advice.


best regards,

Thomas

Andrea Griffini

28 Apr 2003, 08:57:08
On 27 Apr 2003 07:00:05 -0400, terjes.@chello.no (Terje Slettebø)
wrote:

>Therefore, it doesn't really enforce that only
>positive values are passed to it.

It does just the opposite: it hides the fact that
a negative number has been passed. So the callee
can't even discover the problem with a check on
the input parameters.

Andrea

Le Chaud Lapin

28 Apr 2003, 09:00:51
"Ivan Vecerina" <iv...@myrealbox.com> wrote in message news:<3eaa...@news.swissonline.ch>...

> Scott Meyers, among other experts, agrees. See for example
> http://www.aristeia.com/Papers/C++ReportColumns/sep95.pdf

Uh, oh...believe it or not, it took me almost 5 months to "unconvince"
a senior software engineer of something that Scott Meyers had
recommended in his infamous book (that it is better to use members
that are pointers to objects instead of members that are the objects
themselves). But that's another matter...

> My own personal taste, admittedly, would be to prefer
> unsigned types. Not to win a bit of storage, but
> to let the code be more explicit/descriptive.

Ivan, IMO, this is the single most valid reason that any engineer
could ever possibly give for preferring unsigned over signed. Your
perspective suggests something far more profound than what we are
discussing here.

> One typical example is:
> for( unsigned i = v.size() ; --i >= 0 ; )
> process( v[i] );

for( unsigned i = v.size() ; --i; )
process( v[i] );

> The question is:


> What do you win by using unsigned types ?

Exactly what you wrote above. Regularity in a system.

Best Regards,

-Chaud Lapin-

Le Chaud Lapin

28 Apr 2003, 09:01:39
Jack Klein <jack...@spamcop.net> wrote in message news:<rfolavk3hbmdva96q...@4ax.com>...

> unsigned short us = 3;
> int i = -1;

The programmer should be aware of what 'i' and 'us' represent, and
therefore do a cast. Since you have obviously found a (relatively
rare) situation where comparison between unsigned and signed makes
sense, then this extra sensitivity is warranted.

> if (i < us)
> {
> /* stuff */
> }
>

> 1. If the signed variant of the "wider" type can hold all values of...
> 2. If the signed variant of the wider type can't hold all values of...


> the narrower unsigned type, the promotion is to the unsigned wider
> type.
>
> Note that the conversion depends on the range of values for the types,
> not the actual current value of the narrower type being promoted.
>
> So on a compiler where short and int share the same representation (16
> bit architectures, many digital signal processors), us is promoted to
> an unsigned int with the value of 3. This requires that i also be
> converted to unsigned int, and converting -1 to unsigned int results
> in the value UINT_MAX. Since UINT_MAX must be greater than 3, the

> body is not executed....

This is all true, but a programmer that uses types that conceptually
reflect what they are trying to represent would never have to remember
these rules.

Again, in the unsigned long example you gave, the programmer is making a
comparison that would be conceptually irregular without a cast.

Best Regards,

-Chaud Lapin-

Le Chaud Lapin

28 Apr 2003, 09:03:32
bran...@cix.co.uk (Dave Harris) wrote in message news:<memo.20030426...@brangdon.madasafish.com>...

> unorigina...@yahoo.com (Le Chaud Lapin) wrote (abridged):
> > Who agrees with Bjarne and who disagrees?
>
> I agree with Bjarne. Mainly because I find signed types ubiquitous. For
> example, 0 is a signed int literal. So is 42. And the difference between
> two numbers may often be negative.

Your note on '0' and '42' is entirely correct, which is one of the
reasons someone on the C standards committee insisted (correctly IMO)
that one allow for integral constants that are inherently unsigned:
0U 42U

> And I often find it convenient to use
> negative numbers to represent errors or not-found or other information
> in-band.

And so we have it. This is most likely where the "problem" began. A
C programmer might have had a function whose return value should be
inherently unsigned:

unsigned int strlen();

Then have a second function whose return value is also inherently
unsigned:

unsigned int foo();


Then they discover that foo can have an error condition, something
that we now call an exception, so they 'overload' the return type of
foo:

int foo();

This is a *critical* departure from regularity. An engineer who
appreciates virtue in systems design should not underestimate how this
departure will affect his perception of other, unrelated elements in
his system. But in any case, this is what happens.

Here's how it starts to propagate. The programmer might have done
this in code:

int foo ()
{
    ....
    return strlen(s);
    ....
}

Then said, "oh..no problem, I will just changed the return type of
strlen":

int strlen();

Any other variable or function that relies on the return value of
strlen(), an inherently non-negative quantity, should be changed too.

int character_count = strlen(m);

Then finally, any function that expects a count of characters, again,
an inherently non-negative quantity, should be changed to:

void * buffer_allocator (int count_of_characters);

However, we don't want count_of_characters to ever be negative, so we
need a check:

void * buffer_allocator (int count_of_characters)
{
    if (count_of_characters < 0)
        .... // return 0..
}

And so the mess continues. All of this could have been avoided if we
had put square pegs in square holes and round pegs in round holes.
But! People are not perfect! Systems are not perfect! One time out
of a 1000, someone somewhare will break the notion of regularity
anyway. What should we do?

Simple. We should do the 'right' thing the other 999 times, and
contain the 1 time that the 'wrong' thing was done.

This is how you engineer elegance, by making each component as perfect
as possible in the absence of context, then synthesizing from those
components. You do not allow conceptual irregularity to propagate
through your system simply because some other component would find
that "harmless" irregularity convenient.

Best Regards,

-Chaud Lapin-

Hartwig Wiesmann

28 Apr 2003, 14:43:37
Hi Ivan,

you mentioned an often cited example. Unfortunately, this typical
example is wrong anyway (and this is overlooked by a lot of people, too).
It does not matter whether you use signed or unsigned arithmetic because
you are accessing size+1 elements although you wanted to access "size"
elements only - I suppose. The right condition is "--i>0" and you have
to change the index to v[i-1] and everything works fine!
If you complain that there is an extra operation in "i-1" it is correct
but normally all optimizers remove this operation anyway. And if you do
not trust your compiler and need speed you should use iterators anyway.
From my point of view "unsigned" should always be used where unsigned
numbers are involved. Unfortunately, there are some historical (C)
conversions to (signed) int instead of to unsigned int. But as I avoid
implicit conversions anyway, I do not see any danger.

Regards,

Hartwig

apm

28 Apr 2003, 14:45:25
bran...@cix.co.uk (Dave Harris) wrote in message news:<memo.20030426...@brangdon.madasafish.com>...
> unorigina...@yahoo.com (Le Chaud Lapin) wrote (abridged):
> > Who agrees with Bjarne and who disagrees?
>
> I agree with Bjarne. Mainly because I find signed types ubiquitous. For
> example, 0 is a signed int literal. So is 42. And the difference between
> two numbers may often be negative.

A difference cannot be negative, by definition.

-apm

Alf P. Steinbach

28 Apr 2003, 14:49:22
On 27 Apr 2003 08:00:26 -0400, Andrea Griffini <agr...@tin.it> wrote:

>If you have a value that logically should be always non
>negative then IMO you are *not* doing a favor to yourself by
>allocating it in an unsigned variable. The reason is that
>an unsigned variable can't phisically be negative, and
>you're throwing away a possibility of error detection.
>If you say
>
> size -= 3;
>
>and the previous value was 2 then I think it's better
>having size == -1 than having it containing a nonsense
>but positive big number; at least with -1 you *know*
>that is nonsense as a size and so you can take proper
>actions about it.

The above seems to amount to a policy of hunting down bugs
(due to misunderstandings) instead of preventing them (by
preventing misunderstandings).

> Using unsigned for values in this context just *hides* errors.

size = -1; // unsigned => warning, signed => no warning.


Cheers,

- Alf

Anders J. Munch

28 Apr 2003, 14:53:56
"Ivan Vecerina" <iv...@myrealbox.com> wrote:
>
> The question is:
> What do you win by using unsigned types ?

It's easier to write code that works for the full input domain.
Example:

void g_int(int);
void f_int(int val1, int val2)
{
    g_int(val1 - val2);
}

Now with unsigned you can't write it that way. You have to
specialise for val2>val1:

void g_neg(unsigned);
void g_pos(unsigned);
void f_unsigned(unsigned val1, unsigned val2)
{
    if(val1 < val2)
        g_neg(val2 - val1);
    else
        g_pos(val1 - val2);
}

The unsigned case seems just more work, but f_unsigned has one
important feature which f_int hasn't: If g_neg and g_pos work for all
unsigned argument values, then f_unsigned does too.

You can't say the same for f_int: f_int will not work for all int
arguments even if g_int does the same. Rewriting f_int so that it
does would be painful.

But ok, so we're not greedy about that last bit of magnitude; rather
than change the code of f_int we document its limitations:

void f_int(int val1, int val2)
/*
-INT_MAX/2 <= val1 <= INT_MAX/2
-INT_MAX/2 <= val2 <= INT_MAX/2
*/

That could work (not that I've ever seen it done). But then g_int
might have a similar documented limitation, and we'd be down to
INT_MAX/4. In a worst-case scenario you would lose not just a few
bits of magnitude, but one bit per link in the computation chain. A
single recursive function could conceivably chew up the entire
magnitude. Not a very likely occurrence, but if you don't document
legal argument ranges meticulously, how would you know it hasn't
happened?

I find it slightly easier to write correct code with unsigned. YMMV.

- Anders

--
Anders Munch. Software Engineer, Dancontrol A/S, Haderslev, Denmark
Capon - a Python based build tool - http://www.dancontrol.net/share/capon/

Alf P. Steinbach

28 Apr 2003, 15:38:30
On 27 Apr 2003 08:00:44 -0400, Thomas Richter <th...@cleopatra.math.tu-berlin.de> wrote:

>unsigned arithmetic, these differences will never be negative, and a certain
>surprise factor is likely. For example, for unsigned types
>
>a >= b and a - b >= 0
>
>are not equivalent, whereas for signed types, both are equivalent under the
>rather mild assumption that no overflow happens ("types are considered large
>enough for the data you'd need to represent).

A coder who is surprised that an unsigned difference is always
non-negative cannot, IMHO, be relied on to produce otherwise
trustworthy code.

Therefore, as far as I can see the above example can only occur
in a situation of no relevance to the impact of signed/unsigned
on producing trustworthy code.

Furthermore, coding the expression on the left as the expression
on the right is an invitation to disaster even for signed types.

>Strictly speaking, the "no overflow" rule is of course the same
>for unsigned, except that the overflow condition is
>true in half of the useful cases, roughly speaking. (-;

With unsigned arithmetic the expression on the right is always
true, no matter what "rather mild assumptions" are made.

Therefore, overflow is irrelevant for the equivalence between
the expressions using unsigned arithmetic: there is no
such equivalence in the first place.

Furthermore, overflow is well-defined for unsigned arithmetic
whereas it isn't for signed arithmetic; hence not "the same"
in any context.

>A typical example would be a program that needs to identify pixels on a screen:
>Even though all useful pixels are likely to be addressed by unsigned values
>in the range of the screen resolution, it is better to use signed values here since
>it allows you to represent pixel differences as well (e.g. for line drawing and
>clipping algorithms) with the same data type, and without having to be overly
>careful with automatic casting.

"Better" is an absolute judgement.

But in my experience what's best or good enough depends on
the context.

Therefore I must humbly disagree with the above statement.

>However, I have to confess that I personally prefer "unsigned" for data types
>where it looks more natural to me, e.g. counting a number of elements, encoding
>a "size" of an object, and for all kinds of "bit juggling". However, one must
>be aware that blindly casting these types is likely to cause surprises.

That seems to be correct: blind casts are likely to cause surprises, or
at least bugs.

Cheers,

- Alf

Andrea Griffini

28 Apr 2003, 21:56:20
On 28 Apr 2003 14:49:22 -0400, al...@start.no (Alf P. Steinbach)
wrote:

>> Using unsigned for values in this context just *hides* errors.
>
> size = -1; // unsigned => warning, signed => no warning.

I never wrote that; I wrote "size -= 3". If that would
make you feel better then write "size -= 3U"... but that
won't solve any problem.

To be precise the subtraction works and is surely
predictable (even if platform dependent)... the result
is however completely *meaningless* if you're not
working in modulo logic.
For example the size of an std::vector, in my opinion,
has almost always *nothing* to do with modulo logic,
so using an unsigned value is IMO nonsense.

I saw someone that said that if you write

void *alloc_buffer(int size);

then that function is forced to add a check to be
sure that the requested size is not negative.
I'm not sure that transforming a bad call (-1, or x-y
where both x and y are unsigned and y == x+1) into,
say, a 4GB request of virtual memory is such
a big accomplishment in that case.

Let me take this just a little step further...
suppose you have a function that requires a value
that must be only in the range 0..7. Are you then
writing something like the following ?

void foo(unsigned int x)
{
    x &= 7;
    // ... here we are safe ! ...
    ...
}

If you do then please define "safe"...

After that point *surely* x is 0..7, but is that
enough to imply that the request was correct?
Not at all. Actually adding that masking IMO
just makes the software worse.

If your function uses only values 0..255, are you
using "unsigned char" parameters ??? That is not
so different from the example.

IMO using an unsigned as a parameter for something
that can't be negative is just thinking along this line.
My impression is this is not the path to robust software,
but of course you can have a different idea on this.
Robust software would IMO use a signed parameter and
would assert about its non-negativity.
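
A minimal sketch of that signed-parameter-plus-assert style (alloc_buffer
is the hypothetical function mentioned earlier in the post):

    #include <cassert>
    #include <cstddef>
    #include <cstdlib>

    void *alloc_buffer(int size)
    {
        assert(size >= 0);   // a negative request is a caller bug: fail loudly
                             // in a debug build instead of asking for ~4GB
        return std::malloc(static_cast<std::size_t>(size));
    }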

I think that a robust program is not a program that
is just harder to kill.

Andrea

Kevin Cline

28 Apr 2003, 21:57:31
unorigina...@yahoo.com (Le Chaud Lapin) wrote in message news:<fc2e0ade.03042...@posting.google.com>...
> In Section 4.4 on page 73 of "The C++ Programming Language",
> Stroustrup writes
>
> "The unsigned integer types are ideal for uses that treat storage as a
> bit array. Using an unsigned instead of an int to gain one more bit
> to represent positive integers is almost never a good idea. Attempts
> to ensure that some values are positive by declaring variables
> unsigned will typically be defeated by the implicit conversion rules."
>
> In Section 4.10, advice item [18] is:
>
> "Avoid unsigned arithmetic"
>
> I disagree with these two statements. I think that the opposite should
> have been stated, that one should avoid *signed* arithmetic, unless
> the context of whatever is being modeled implicitly prescribes signed.
> To say that it irks me that some people actually blindly follow this
> advice would be an understatement.
>
> But before I rant, and I do intend to rant, I would like to hear the
> opinions of others.
>
> Who agrees with Bjarne and who disagrees?

I agree with Bjarne, but since all my compilers now warn of
signed->unsigned conversions, I don't worry about it much anymore.
But I don't use integer indexes much either; I always use iterators.
My opinion is that the language should be changed so that
signed->unsigned conversions are no longer permitted.

Jim Melton

28 Apr 2003, 21:59:07
"Anders J. Munch" <ande...@dancontrol.dk> wrote in message
news:3ead34d3$0$10410$edfa...@dread11.news.tele.dk...

> "Ivan Vecerina" <iv...@myrealbox.com> wrote:
> >
> > The question is:
> > What do you win by using unsigned types ?
>
> It's easier to write code that works for the full input domain.
> Example:
>
> void g_int(int);
> void f_int(int val1, int val2)
> {
> g_int(val1 - val2);
> }
>
> Now with unsigned you can't write it that way. You have to
> specialise for val2>val1:

Actually, you need to be clear on your preconditions. What exactly does
val1-val2 represent? If you truly want the magnitude of the difference of
two numbers, then you should be taking the absolute value. Since subtraction
is not commutative, you can't arbitrarily swap the arguments of operator-.

Passing a negative number to a function that expects unsigned is an error.
The argument that Meyers makes is that use of unsigned makes it an
undetectable error inside the function.

> void g_neg(unsigned);
> void g_pos(unsigned);
> void f_unsigned(unsigned val1, unsigned val2)
> {
> if(val1 < val2)
> g_neg(val2 - val1);
> else
> g_pos(val1 - val2);
> }
>
> The unsigned case seems just more work, but f_unsigned has one
> important feature which f_int hasn't: If g_neg and g_pos work for all
> unsigned argument values, then f_unsigned does too.

The functions g_neg and g_pos still can not protect themselves against
incorrect input. You have put a burden on the calling code that is simply
not enforceable. I didn't have a strong opinion coming in to this thread,
but you are moving me over to Bjarne's side (I'm sure he'll be thrilled to
know it :-)
--
<disclaimer>
Opinions posted are those of the author.
My company doesn't pay me enough to speak for them.
</disclaimer>
--
Jim Melton
Software Architect, Fusion Programs
Lockheed Martin Astronautics
(303) 971-3846

Matvei Brodski

29 Apr 2003, 05:22:58
al...@start.no (Alf P. Steinbach) wrote in message news:<3eabba21...@News.CIS.DFN.DE>...

> On 26 Apr 2003 09:13:41 -0400, unorigina...@yahoo.com (Le Chaud Lapin) wrote:
>
> >In Section 4.4 on page 73 of "The C++ Programming Language",
> >Stroustrup writes
> >
> >"The unsigned integer types are ideal for uses that treat storage as a
> >bit array. Using an unsigned instead of an int to gain one more bit
> >to represent positive integers is almost never a good idea. Attempts
> >to ensure that some values are positive by declaring variables
> >unsigned will typically be defeated by the implicit conversion rules."
> >
> >In Section 4.10, advice item [18] is:
> >
> >"Avoid unsigned arithmetic"
> >
> >... I would like to hear the opinions of others.
...

>
> With that as a premise I tend to use unsigned types whenever an
> integer should not be negative. Catering to negative values by
> using a signed type would just be putting your head in the sand,
> like an ostrich (it's the same kind of reasoning that makes people
> avoid using exceptions: if we don't see an error, it doesn't exist).
> Unfortunately many do exactly that, using signed arithmetic to avoid
> seeing, and what's more, require or admonish others to follow suit.
>

Exactly. Consider the example from Scott Meyers' column:

int f();
int g();

Array<double> a(f()-g());

Scott's conclusion is that if an Array's constructor takes a signed
integer (rather than an unsigned one), it gives him an extra check to
ensure that whatever is returned by f() is greater than whatever is
returned by g().
Now, how very odd. Consider all the possible errors the user's code
can contain. Starting with the fact that f() and g() can have an
overflow inside. I.e., the fact that f() returns something less than
g() does not mean we do not have a bug. Out of all the variety of bugs
we catch a subset that makes f()-g() negative. Statistically speaking -
50%. Ok, you can say that catching some bugs is better than none. So?

The way bugs are caught in Scott's example is this: we decide to limit
the number of "correct" inputs to half of the range of all possible
inputs and then proclaim any outside number to be in error. Duh! Why
limit the "correct" range to INT_MAX numbers then? Let's limit it to
INT_MAX / 2 and we will catch 2 times more "bugs"!

So, as far as I am concerned, if something must be non-negative, make
it unsigned. Container sizes, screen coordinates, person's age or
height, etc.

John Potter

29 Apr 2003, 05:26:19
On 28 Apr 2003 14:43:37 -0400, Hartwig Wiesmann
<hartwig....@wanadoo.nl> wrote:

> It does not matter whether you use signed or unsigned arithmetic because
> you are accessing size+1 elements although you wanted to access "size"
> elements only - I suppose. The right condition is "--i>0" and you have
> to change the index to v[i-1] and everything works fine!

No, his code is correct for signed. Your code decrements i to size - 1 and
then uses size - 1 - 1 which skips the last item. Your loop also crashes
and burns when size is zero.

John

David Abrahams

29 Apr 2003, 05:26:56
Andrea Griffini <agr...@tin.it> writes:

> > On 27 Apr 2003 07:00:05 -0400, terjes.@chello.no (Terje Slettebø)
> wrote:
>
> >Therefore, it doesn't really enforce that only
> >positive values are passed to it.
>
> It does just the opposite: it hides the fact that
> a negative number has been passed. So the callee
> can't even discover the problem with a check on
> the input parameters.

To look at it another way, if the permissible range of input values
has an upper limit, using unsigned means a single check for exceeding
that limit can be used to catch most likely negative values as well.
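
A small sketch of the single-check idea (the function and limit are
hypothetical, not from the original post):

    bool set_index(unsigned idx)
    {
        const unsigned max_index = 1000;   // hypothetical upper limit
        if (idx > max_index)
            return false;    // also catches set_index(-1): the -1 wraps to a
                             // huge unsigned value, far above the limit
        // ...
        return true;
    }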

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

Allan W

29 Apr 2003, 05:27:14
ap...@student.open.ac.uk (apm) wrote

> A difference cannot be negative, by definition.

Why do you say this? Can you take 3 from 2? What should happen if
someone tries -- should we calculate 2 - 3 = 1? Should the program
crash?

Your statement is both true and false, but more false than true. It
depends on which definition you use. From dictionary.com:

6. Mathematics:
a. The amount by which one quantity is greater or less than another.
b. The amount that remains after one quantity is subtracted from another.

Guess which one applies to the - operator in C?
Guess which one is expected in *any* real high-level computer language?

Theoretically, we could make do with either definition.

Since C does use definition b (score 1 point for a correct answer),
we can still synthesize the unsigned difference by comparing to zero
and using unary- as well.

If C used definition a, we could still determine magnitude by
using <. If unary- doesn't change meaning, we could even synthesize
definition b. Beginning language students would find it confusing
that unary- can return a negative number, but binary- cannot... but
they would get used to it. We'd also suffer a (probably
insignificant) loss of efficiency -- I don't think there are any
CPUs that implement a subtract operator that always returns a
positive number, so the generated code would have to do the very
same synthesis I talked about above (check sign, negate if necessary).

In either model, the issue of INT_MIN would be a problem.

Practically, even with teaching and efficiency issues ignored, it
simply makes more sense for the subtraction operator to return a
signed value. In most cases we already know what the sign will be;
if we wish the result to be positive, we arrange for the larger
number to be on the left. In cases where we don't know in advance
which number will be larger, more often than not we need to find
out. Currently, code can find out by checking the result of the
difference. With the alternate definition of difference, we would
require an extra test.

Alf P. Steinbach

29 Apr 2003, 05:29:36
(I'm not sure whether this discussion merits a new
thread; it's now tangential to the original issue.)


On 28 Apr 2003 21:56:20 -0400, Andrea Griffini <agr...@tin.it> wrote:

>On 28 Apr 2003 14:49:22 -0400, al...@start.no (Alf P. Steinbach)
>wrote:
>
>>> Using unsigned for values in this context just *hides* errors.
>>
>> size = -1; // unsigned => warning, signed => no warning.
>
>I never wrote that; I wrote "size -= 3".

Right. The previous posting went like this,
except the explanatory brackets:

[context]              > size -= 3;
                       >
[technical discussion] > ...
                       >
[concluding statement] > Using unsigned for values
                       > in this context just *hides*
                       > errors.

[counter-example]

  size = -1; // unsigned => warning, signed => no warning.

>If that would make you feel better then write
>"size -= 3U"... but that won't solve any problem.

Writing "3U" instead of "3" will in this case probably
not solve any problem.


>...


>I saw someone that said that if you write
>
> void *alloc_buffer(int size);
>
>then that function is forced to add a check to be
>sure that the requested size is not negative.

If the function's contract is to range-check its argument
in some way, then, by definition of "contract", it should
do so regardless of the physical type of argument.

If, on the other hand, its contract doesn't include range
checking, then it doesn't need to do any range-checking.

Note that it can do meaningful range checking regardless of
whether the argument type is signed int or unsigned int --
the valid range could, for example, be 0 through
std::numeric_limits<int>::max() in both cases, and both
cases then have some range of potential but disallowed values.

Note that it can also do without range checking regardless of
the type of the argument, in which case a breach of contract
from the client leads to undefined behavior for both types.

However, for the unsigned case the contract can be made so as
to be impossible to breach.

In the signed int case it will, on the other hand, always be
possible to pass an invalid argument value (namely, negative).

In summary, it is possible to select a contract such that both
signed and unsigned have the exact same pros and cons; it is
possible to select a contract (all possible values allowed and
meaningful) which only unsigned can fulfill; it is not possible,
as far as I can see, to select a contract where signed has any
advantage (but that regards this _particular_ example function).

>I'm not sure that transforming a bad call (-1, or x-y
>where both x and y are unsigned and y == x+1) into,
>say, a 4GB request of virtual memory is such
>a big accomplishment in that case.

Presumably this refers to the contract of all values allowed
and meaningful?

In that case the client code isn't technically in breach of
contract wrt. the function, but the client code contains a bug
that causes it to request an incorrect amount of memory.

Note that with unsigned argument type (as this contract requires)
a good compiler will indicate the bug by issuing a warning; hence,
the bug can be caught at compile time instead of run-time.

With signed argument type this same client-code bug would slip
undetected through compilation, only to be detected at run-time.

What's best: to hunt it down then, or catch it at compile time?


>Let me take this just a little step further...

I'm sorry that I fail to see the connection with the above,
but I'll try to answer as best I can anyway.

>suppose you have a function that requires a value
>that must be only in the range 0..7. Are you then
>writing something like the following ?
>
> void foo(unsigned int x)
> {
> x &= 7;

That depends on whether the requirement is for the
value passed in to the function (in which case it's
probably not a good idea) or for some value the function
uses internally (in which case, whether it can be a good
idea depends on the details of the requirement).

> // ... here we are safe ! ...

IMHO such a comment might be very misleading.

> ...
> }


>...


>Actually adding that masking IMO just makes the
>software worse.

In my experience whether something is better or worse,
good enough or not, depends very much on the context.

In this case, the context would be the function's contract
and purpose, which is not specified; an example where the
masking could be relevant could be a function that uses
the three lower bits as a distinct part of a multi-part
value packed in an int (this would be at a very low level).

Therefore, I must humbly disagree with the above statement.


Hth.,

- Alf

nobody

29 Apr 2003, 05:32:14
Le Chaud Lapin wrote:
> In Section 4.4 on page 73 of "The C++ Programming Language",
> Stroustrup writes
>
> "The unsigned integer types are ideal for uses that treat storage as a
> bit array. Using an unsigned instead of an int to gain one more bit

> to represent positive integers is almost never a good idea. Attempts
> to ensure that some values are positive by declaring variables
> unsigned will typically be defeated by the implicit conversion rules."
>
> In Section 4.10, advice item [18] is:
>
> "Avoid unsigned arithmetic"
>
> I disagree with these two statements. I think that the opposite should
> have been stated, that one should avoid *signed* arithmetic, unless
> the context of whatever is being modeled implicitly prescribes signed.
> To say that it irks me that some people actually blindly follow this
> advice would be an understatement.
>
> But before I rant, and I do intend to rant, I would like to hear the
> opinions of others.

Dr. Stroustrup is right and you are wrong.

Consider the following:

int i = -1;
int j = 0;
unsigned int u = 0U;
unsigned int v = 1U;

Then all of the following evaluate to true:

i < j

j == u

u < v

but perversely, so does the following:

v < i

For me the fact that an unsigned int may be _less_ than a
negative int settles the issue that unsigned int is _not_
in any sense a reasonable approximate model of ordinary integer
arithmetic. As others on this thread have pointed out, all
sorts of standard integer manipulations fail to work on
expressions involving unsigned ints. Mixing signed int and
unsigned int in comparisons is especially deadly. Fortunately,
many compilers now warn for this, and it's a warning that I
always heed.
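
Collected into a runnable sketch (not from the original post), all four
assertions pass, the last one because -1 converts to UINT_MAX before the
comparison:

    #include <cassert>

    int main()
    {
        int i = -1;
        int j = 0;
        unsigned int u = 0U;
        unsigned int v = 1U;

        assert(i < j);
        assert(j == u);
        assert(u < v);
        assert(v < i);   // the "perverse" case: i converts to UINT_MAX
    }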

In my opinion, the integral typing and promotion rules are
one thing that C and C++ got wrong and Pascal got right. In
the latter there was only INTEGER and subranges thereof; in
expressions everything promotes to INTEGER, and comparisons
work correctly with no surprises, even when one mixes
types constrained to a non-negative subranges with ones that
can be negative. Of course, the same is true in C or C++ on
platforms where sizeof(char) < sizeof(short) < sizeof(int)
if one mixes signed/unsigned char/short with int, and just
avoids unsigned int. I just wish that were not platform
dependent.

nobody

John Potter

29 Apr 2003, 08:08:53
On 28 Apr 2003 09:00:51 -0400, unorigina...@yahoo.com (Le Chaud
Lapin) wrote:

> "Ivan Vecerina" <iv...@myrealbox.com> wrote in message news:<3eaa...@news.swissonline.ch>...

> > One typical example is:
> > for( unsigned i = v.size() ; --i >= 0 ; )
> >     process( v[i] );

> for( unsigned i = v.size() ; --i; )
> process( v[i] );

The first never ends (as intended, to show the problem) and the last ends
too soon (missing v[0]), proving the difficulty. The last can also
take a long time when size is zero if there is no segfault.

for (unsigned i = v.size(); i; ) {
    -- i;
    process(v[i]);
}

The difficulty of getting count-down loops right with unsigned is the
same as with pointers into arrays and iterators. That is why
the standard has reverse_iterator to make things easier to get
right.

John

Anders J. Munch

29 Apr 2003, 08:24:00
"Jim Melton" <jim.m...@lmco.com> wrote:
> "Anders J. Munch" <ande...@dancontrol.dk> wrote in message
>
> Actually, you need to be clear on your preconditions. What exactly does
> val1-val2 represent? If you truly want the magnitude of the difference of
> two numbers, then you should be taking the absolute value. Since subtraction
> is not commutative, you can't arbitrarily swap the arguments of operator-.

Nor did I. I differentiated g_int into g_neg and g_pos for this
purpose. g_neg must take the sign reversal into account.

Duplicating g appears cumbersome but isn't really, because you will so
often want different behaviour for the negative case anyway.

- Anders

LLeweLLyn

29 Apr 2003, 09:16:12
unorigina...@yahoo.com (Le Chaud Lapin) writes:

> In Section 4.4 on page 73 of "The C++ Programming Language",
> Stroustrup writes
>
> "The unsigned integer types are ideal for uses that treat storage as a

> bit array. Using an unsigned instead of an int to gain onr more bit


> to represent positive integers is almost never a good idea. Attempts
> to ensure that some values are positive by declaring variables
> unsigned will typically be defeated by the implicit conversion
> rules."

Specifically, negative signed values will be silently transmuted to
high unsigned values, because of the implicit conversion from
signed to unsigned.

There's an additional problem with unsigned arithmetic: it isn't the
natural number arithmetic we are used to. It's modular
arithmetic. (See 3.9/4) In natural number arithmetic, 3 - 5 is an
error - negative numbers aren't in the set of natural numbers. In
unsigned arithmetic, 3 - 5 has an implementation-dependent but
well-defined result - no compile or runtime error is
triggered. And since that result isn't negative, unsigned numbers
don't model integers either.
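
For example, a two-line sketch of the modular behaviour (not from the
original post):

    unsigned a = 3, b = 5;
    unsigned d = a - b;   // no error is raised: d == UINT_MAX - 1, i.e. the
                          // value of 3 - 5 reduced modulo 2^N for N-bit unsigned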

So unsigned numbers present two kinds of traps:

(a) Implicit conversions.

(b) They don't model natural numbers, integers, or any other
number category most people have everyday experience with.

It's possible that if one of these traps were removed, the remaining
one would not be sufficient reason to avoid unsigned arithmetic,
and we could use them to document that, for example, a negative
size makes no sense for container. However, I doubt there is a way
to remove or reduce either (a) or (b) for C++.

> In Section 4.10, advice item [18] is:
>
> "Avoid unsigned arithmetic"
>
> I disagree with these two statements. I think that the opposite should
> have been stated, that one should avoid *signed* arithmetic, unless
> the context of whatever is being modeled implicitly prescribes
> signed.

In my experience such an opinion comes from the (mistaken) notion that
unsigned numbers model natural numbers, possibly combined with
ignorance of how troublesome the implicit conversion rules really
are.

In practice, you can't avoid signed arithmetic - it's too convenient,
too frequently used by other programmers - and, realistically, a
better model for many (though not all) of the things we measure numerically.

> To say that it irks me that some people actually blindly follow this
> advice would be an understatment.

[snip]

I don't follow Bjarne's advice blindly; for years I disagreed with
this item strongly - perhaps as strongly as you do. Heavy use of
containers that provided size as unsigned (the STL containers
among them, but not primarily; STL iterator idioms make the size
functions almost unnecessary) taught me differently, the hard
way. I spent a good many hours tracking down bugs caused by the
(a) and (b) traps I mention above. Now I know better. I hope :-)

Andrea Griffini
29 Apr 2003, 10.32.24
On 29 Apr 2003 05:26:56 -0400, David Abrahams
<da...@boost-consulting.com> wrote:

>To look at it another way, if the permissible range of input values
>has an upper limit, using unsigned means a single check for exceeding
>that limit can be used to catch most likely negative values as well.

I've used that technique in the past, but only
when writing in assembler. I'm not sure that I would
appreciate a C++ program doing that kind of low-level
optimization. By the way, it's not that absurd to
expect a good optimizing compiler to be able to
recognize this kind of quite common test (x>=0 && x<n)
and translate it into the optimized machine instructions
that logically implement ((unsigned)x < n).
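
For concreteness, a small sketch of the two equivalent forms of the
check being discussed (in_range and n are hypothetical names; the
single-comparison form assumes n >= 0):

    bool in_range_two_tests(int x, int n)
    {
        return x >= 0 && x < n;        // the natural signed form: two comparisons
    }

    bool in_range_one_test(int x, int n)
    {
        // a negative x wraps to a value above any non-negative n, so one comparison suffices
        return static_cast<unsigned>(x) < static_cast<unsigned>(n);
    }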

Maybe I'm wrong... but the objection made was that
using an unsigned you need no test at all.
That position is in my opinion hardly defensible
(if your target is actually writing a program that
works... if your target is just finding someone to blame
instead, then using unsigned parameters sounds just *great*).

Andrea

Matvei Brodski
29 Apr 2003, 10.33.15
David Abrahams <da...@boost-consulting.com> wrote in message news:<u65oyt...@boost-consulting.com>...

> Andrea Griffini <agr...@tin.it> writes:
>
> > On 27 Apr 2003 07:00:05 -0400, terjes.@chello.no (Terje Sletteb)
> > wrote:
> >
> > >Therefore, it doesn't really enforce that only
> > >positive values are passed to it.
> >
> > It does just the opposite: it hides the fact that
> > a negative number has been passed. So the callee
> > can't even discover the problem with a check on
> > the input parameters.
>
> To look at it another way, if the permissible range of input values
> has an upper limit, using unsigned means a single check for exceeding
> that limit can be used to catch most likely negative values as well.

Ditto. Actually, since we are talking about modulo arithmetic, all
we can say is that by choosing a range of acceptable inputs that is
smaller than the range of all possible inputs, we can check for bugs
that (by chance!) result in an "unacceptable" input. The probability of
catching a bug depends on the size of the chosen range, but not on
its location. If it is located "at the edge" of the range of all possible
values (as it is when we use unsigned) then we can check with one
comparison. When it is in the middle (when we use int), we have to use
two comparisons. But the effect is the same in both cases.

Dave Harris
29 Apr 2003, 14.45.32
unorigina...@yahoo.com (Le Chaud Lapin) wrote (abridged):
> > And I often find it convenient to use negative numbers to
> > represent errors or not-found or other information in-band.
>
> And so we have it. This is most likely where the "problem" began.

I don't think so. It was the last and least of the reasons I gave. It's
more that, since we already have a signed value, we can use a constant as
sentry. Sometimes this is more convenient than a non-constant. For
example, compare:

bool containsItem = (find( array, 0, 100, item ) < 0);

with:

bool containsItem = (find( array.begin(), array.end(), item )
!= array.end());

In the first code, the result of find() is self-contained. In the second,
we need to use array again to make sense of it.


> A C programmer might have had a function whose return value should be
> inherently unsigned:
>
> unsigned int strlen();

But this is using unsigned int as if it were an int constrained to be
unsigned! Unfortunately the C/C++ rules do not support that usage well.


> This is a *critical* departure from regularity.

Which is unavoidable, assuming we don't want to use exceptions. We might
represent that irregularity more explicitly, for example with:

struct Result {
int value;
bool isError;
};

Or perhaps with a class which prevents the value being accessed when the
isError flag is true. Or whatever. We clearly have two distinct values,
and the use of the sign bit is a way of packing both values into a single
quantity. Packing is more efficient, but I hesitate to call it premature
optimisation because it is also less work.

Even with an explicit Result pair, I'd still use a signed int for the
value, for reasons mentioned in my earlier article and elsewhere in this
thread. Unsigned is best reserved for, eg, when you need guaranteed
behaviour on arithmetic overflow and underflow.


> int foo ()
> {
> ....
> return strlen(s);
> ....
> }
>
> Then said, "oh..no problem, I will just changed the return type of
> strlen"

I doubt it. If we follow the Result approach, the code would become like:

Result foo() {
return Result( strlen(s), true );
}

In other words, an explicit conversion. I doubt we would change strlen()
to return a Result. In the original code, there is probably no need to do
anything, since the implicit conversion is pretty good, but for safety

return boost::numeric_cast<int>( strlen(s) );

would defend against strlen returning a result too big for int. (Which
possibility again demonstrates that unsigned int is not just a Pascal-like
subrange of int.)

Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
bran...@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."

Alf P. Steinbach
29 Apr 2003, 14.46.51
On 29 Apr 2003 09:16:12 -0400, LLeweLLyn <llewe...@xmission.dot.com> wrote:

>unorigina...@yahoo.com (Le Chaud Lapin) writes:
>
> > In Section 4.4 on page 73 of "The C++ Programming Language",
> > Stroustrup writes
> >
> > "The unsigned integer types are ideal for uses that treat storage as a
> > bit array. Using an unsigned instead of an int to gain onr more bit
> > to represent positive integers is almost never a good idea. Attempts
> > to ensure that some values are positive by declaring variables
> > unsigned will typically be defeated by the implicit conversion
> > rules."
>
>Specifically, negative signed values will be silently transumuted to
> high unsigned values, because of the implicit conversion from
> signed to unsigned.

Let us not forget that for many (nearly all) C++ implementations
modulo arithmetic is also used for signed integers.

To maintain that well-defined behavior is a problem, whereas behavior
that is not well-defined (signed arithmetic) but in practice the same
should be a solution -- well, with all respect, I fail to see the logic in that.


>There's an additional problem with unsigned arithmetic: it isn't the
> natural number arithmetic we are used to. It's modular
> arithmetic. (See 3.9/4)

That isn't "additional", it's the reason for the conversion rules.

And I for one don't see it as a "problem", but as a feature... ;-)


> In natural number arithmetic, 3 - 5 is an
> error - negative numbers aren't in the set of natural numbers. In
> unsigned arithmetic, 3 - 5 has an implementation-dependent but
> well-defined result - no compile or runtime error is
> triggered. And since that result isn't negative, unsigned numbers
> don't model integers either.

Let's not forget that for many (nearly all) C++ implementations the
same general behavior is exhibited by signed arithmetic, but with
one crucial difference: for signed arithmetic this behavior is not
specified by the standard, and so is not universally portable.

Unsigned arithmetic: modulo arithmetic by definition.

Signed arithmetic: modulo arithmetic in practice, but not by definition
(can that really be _better_?).

>So unsigned numbers present two kinds of traps:

(a) and (b) below are not "two kinds" but the same, namely
modulo arithmetic.

(a) and (b) below are IMO not "traps".

Most people have, I would presume, everyday experience with clocks and
degrees of angle, both of which are based on modulo arithmetic.

> (a) Implicit conversions.
>
> (b) They don't model natural numbers, integers, or any other
> number category most people have everyday experience with.

For a typical C++ implementation, signed integers present three problems
to the novice programmer:


(a) Implicit conversions (herein included promotions).

(b) Modulo arithmetic.

(c) That (b) is not specified by the standard.


But as I and others have written elsewhere in this thread, the technical
"problems" of integer arithmetic -- signed or unsigned -- are not
really the most important issue in selecting a type to use.

The issue is instead, to balance the needs of conveying information to
the programmer (preventing bugs, in many cases catching them at compile
time) versus supporting ease of firewall programming and debugging.
This balancing act will in general require intelligent, context-dependent
decisions. And so I don't believe "signed/unsigned is bad" is good.

Cheers,

- Alf

John Potter
29 Apr 2003, 14.48.42
On 27 Apr 2003 06:53:34 -0400, "Ivan Vecerina" <iv...@myrealbox.com> wrote:

> The fact is that using unsigned instead of signed types
> can create bugs -- as you lose information.


> One typical example is:
> for( unsigned i = v.size() ; --i >= 0 ; )
> process( v[i] );

It is amusing that this example could be used as the reason to
follow standard stl usage when writing loops.

for (int i = v.size() - 1; i != -1; -- i)
process(v[i]);

Works just fine. Oops, I just learned that subscripts should use
size_type.

for (vector<T>::size_type i = v.size() - 1; i != -1; -- i)
process(v[i]);

Still works. Whatever unsigned type is used for size_type, -1 is
0 - 1 and everything is beautiful.

Almost as nice as iterator and reverse_iterator. Also shows that
subscripts are superior to iterators for use by less than experts.

John

E. Mark Ping
29 Apr 2003, 14.56.03
In article <3ead62e1....@News.CIS.DFN.DE>,

Alf P. Steinbach <al...@start.no> wrote:
>size = -1; // unsigned => warning, signed => no warning.

Yes, many compilers do this, and it's incredibly annoying. The
standard says (in 4.7/2):

"If the destination type is unsigned, the resulting value is the least
unsigned integer congruent to the source integer (modulo 2^n where n
is the number of bits used to represent the unsigned type)."

Hence,

unsigned int val = -1;

Is a perfectly valid way to initialize 'val' to the largest
representable value. That may be outdated now, since you could
equally use:

unsigned int val = std::numeric_limits<unsigned int>::max();
--
Mark Ping
ema...@soda.CSUA.Berkeley.EDU

E. Mark Ping
29 Apr 2003, 14.57.28
In article <bd47bb0e.03042...@posting.google.com>,
KIM Seungbeom <musi...@bawi.org> wrote:
>Furthermore, I would encourage writing 'i > 0' or 'i != 0' instead of
>just 'i' for a condition, for it is not intuitively seen as a boolean.

This is slightly off-topic, but why? In C and C++, it is common to
use 'x' to mean 'x != 0'; the effects are known and well-defined.

David Barto
29 Apr 2003, 14.58.21
nobody <clcppm...@this.is.invalid> wrote in message news:<200304290647...@localhost.localdomain>...

> Le Chaud Lapin wrote:
> > In Section 4.4 on page 73 of "The C++ Programming Language",

My problem with all of this is that I consider the \model/ I am trying
to use before I consider the \type/ I use.

If I am modeling temperature, and using degrees Kelvin, then unsigned
makes sense, since you can't have negative degrees Kelvin, while with both
degrees C and F you can. If modeling skyscrapers, floors are unsigned
and height is signed, as you can dig below ground (or sea) level.

Considering only the extra bit you get from unsigned is not the right approach.
Note further that std::string::length() is unsigned, and that std::string::npos
is a representation of -1. If your string is so long that its length
matches npos, then you should be thinking of a new way to contain that
thing, which most likely is not really a string any more.

Of course, for simple code, I use just plain int, but when it matters
to the model I'm trying to work with, the signed/unsigned value makes
a difference.

You can't have a negative index into an array, or a negative length
of a string. Making these unsigned catches 'bad things' by causing the program
to core dump, and leaves a trace you can use to find out why it happened.

A signed value as an index in an array can give you a valid offset which could
point to something on the stack before the array begins:

int a;
int b[3];

return(b[-1]);

If the array index is unsigned, you get a core dump.

Design the model, then pick the types.

David Barto
barto at visionpro dot com

Thomas Mang
29 Apr 2003, 14.58.40

Allan W wrote:

>
> Practically, even with teaching and efficiency issues ignored, it
> simply makes more sense for the subtraction operator to return a
> signed value. In most cases we already know what the sign will be;
> if we wish the result to be positive, we arrange for the larger
> number to be on the left. In cases where we don't know in advance
> which number will be larger, more often than not we need to find
> out. Currently, code can find out by checking the result of the
> difference. With the alternate definition of difference, we would
> require an extra test.

Returning a signed value has advantages and drawbacks:

a) no need to check which argument is greater
b) possible underflow


Take this example:


int a = -20;
int b = std::numeric_limits<int>::max();

int c = a - b; // undefined: underflow, because the distance is < std::numeric_limits<int>::min()

Now take 2 unsigned numbers:
Here, one has to check which one is greater, but then no underflow is possible:

unsigned int a = 40;
unsigned int b = std::numeric_limits<int>::max();
unsigned int distance = a > b ? a - b : b - a;


Now as an exercise write a (portable) test that checks the former example for
possible underflow / overflow.
Whatever you come up with, it will probably be more complex than the test for
unsigned types.
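
One possible answer to that exercise, as a sketch (the function name is
made up; it is indeed more involved than the single comparison needed
for unsigned):

    #include <limits>

    bool subtraction_would_overflow(int a, int b)
    {
        if (b > 0)   // a - b could fall below INT_MIN
            return a < std::numeric_limits<int>::min() + b;
        if (b < 0)   // a - b could rise above INT_MAX
            return a > std::numeric_limits<int>::max() + b;
        return false;
    }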


Note also that for unsigned types, overflow/underflow is defined by using modulo,
whereas for signed it's undefined.

best regards

Thomas

chris jefferson
29 Apr 2003, 15.04.45

"Andrea Griffini" <agr...@tin.it> wrote in message
news:b9osavkbhkrberksf...@4ax.com...

> On 29 Apr 2003 05:26:56 -0400, David Abrahams
> <da...@boost-consulting.com> wrote:
>
> >To look at it another way, if the permissible range of input values
> >has an upper limit, using unsigned means a single check for exceeding
> >that limit can be used to catch most likely negative values as well.
>
> I've used that tecnique in the past, but only
> when writing in assembler. I'm not sure that I would
> appreciate a C++ program doing that kind of low-level
> optimization. By the way it's not that absurd to
> expect a good optimizing compiler being able to
> discover this kind of quite common test (x>=0 && x<n)
> and traslating it in the optimized machine instructions
> that logically implement ((unsigned)x < n).

What are you assuming (unsigned)(-1) or other values are? On any system I
can think of, some negative values would have to get mapped to a positive
integer < n.

Anders J. Munch
29 Apr 2003, 15.05.22
"Le Chaud Lapin" <unorigina...@yahoo.com> wrote:

> "Ivan Vecerina" <iv...@myrealbox.com> wrote:
> > for( unsigned i = v.size() ; --i >= 0 ; )
> > process( v[i] );
>
> for( unsigned i = v.size() ; --i; )
> process( v[i] );

Yuch, for loop conditions with side-effects. Any shop where that
qualifies as typical code has bigger problems than signed v. unsigned.

- Anders

Alf P. Steinbach
30 Apr 2003, 06.02.11
On 29 Apr 2003 14:56:03 -0400, ema...@soda.csua.berkeley.edu (E. Mark Ping) wrote:

>In article <3ead62e1....@News.CIS.DFN.DE>,
>Alf P. Steinbach <al...@start.no> wrote:
>>size = -1; // unsigned => warning, signed => no warning.
>
>Yes, many compilers do this, and it's incredibly annoying.

Nah, such warnings are exactly what I crave.

Except -- when they're generated for standard library source code... :-(


>...


>unsigned int val = -1;
>
>Is a perfectly valid way to initialize 'val' to the largest
>representable value.

Yes yes yes, it is; the technical aspect is not in question.

But from an engineering point of view:

>That may be outdated now, since you could equally use:
>
>unsigned int val = std::numeric_limits<unsigned int>::max();

Or, e.g. (which I like better, but that's personal taste),


unsigned val = static_cast<unsigned>( -1 );


Or whatever conveys the _intent_.

This is not a recommendation to add a cast wherever there is a warning.

Instead, it is a recommendation to heed those warnings, take a close
look at the offending code, and only cast away the warning as a conscious,
well-informed decision -- and to regard code that produces any warnings
at all, at the highest warning level, as suspicious, and ditto for code
that contains more than a minimal number of casts.


Cheers,

- Alf

Matthew Collett
30 Apr 2003, 06.16.14
In article <ccd2e6e6.03042...@posting.google.com>,
ba...@visionpro.com (David Barto) wrote:

> If I am modeling temperature, and using degrees Kelvin, then unsigned
> make since as you can't have negative degrees Kelvin, while with both
> degrees C and F you can. If modeling skyscrapers, floors are unsigned
> and height is signed, as you can dig below ground (or sea) level.
>

> David Barto

Might you want to take the difference of two temperatures expressed in
degrees Kelvin? Certainly you might. Do you want the result expressed
as a signed quantity? Certainly you do. So your initial values must be
signed. (Since temperature is a continuous quantity, double would be an
even better choice, but that is not the point here.)

Saying "All these numbers are positive, therefore they should
be represented as unsigned int." is exactly analogous to saying "All
squares are rectangles, therefore class Square should derive from class
Rectangle.": it makes the error of looking at _values_, when what
actually matters is _behaviour_.

Best wishes,
Matthew Collett

--
Those who assert that the mathematical sciences have nothing to say
about the good or the beautiful are mistaken. -- Aristotle

Ken Hagan
30 Apr 2003, 08.39.57
David Barto wrote:
>
> My problem with all of this is that I consider the \model/ I am trying
> to use before I consider the \type/ I use.
>
> If I am modeling temperature, and using degrees Kelvin, then unsigned
> make since as you can't have negative degrees Kelvin, while with both
> degrees C and F you can. If modeling skyscrapers, floors are unsigned
> and height is signed, as you can dig below ground (or sea) level.

Negative kelvins turn up in some situations with population inversion.
Whilst nobody is re-writing thermodynamics, negative values for the
"T" in the equations give the right answers.

Similarly, Richard Feynman wrote an amusing essay where he extended
the possible values of probability from the stuffy old [0,1] to all
the real numbers. Again, it simplified calculations to permit such
things, even if they were "unphysical" as end-results.

I see no reason not to have floor -1. I'm pretty sure I've used
lifts which advertised precisely that.

> Design the model, then pick the types.

Agreed, but don't expect other programmers to share your notion of
what the model "ought" to be. Do expect those who are unusually
talented, unusually stupid, or working in unusual domains, to bend
the rules.

I think all this thread proves is that integer types in C/C++ are
a horrible mess and languages like Ada and Pascal get it right.
Sadly, it is too late to change C or C++, so we all have to use
compiler warnings instead.

Le Chaud Lapin
30 Apr 2003, 09.24.01
For the record, I am pro-unsigned.

nobody <clcppm...@this.is.invalid> wrote in message news:<200304290647...@localhost.localdomain>...

> Dr. Stroustrup is right and you are wrong.

How interesting. I sent Bjarne a very long email message on the
signed/unsigned subject a while back. Here's how his response began:

"What you say, sounds logical. maybe it even is logical, but that's not
the point. I have just seen too many examples of unsigned misused in C and
C++ to feel comfortable with any C or C++ program that uses unsigned in
comparisons." - Bjarne Stroustrup

Hmm....so I guess practice takes precedence over principle.

nobody <clcppm...@this.is.invalid> wrote:
> Consider the following:
>
> int i = -1;
> int j = 0;
> unsigned int u = 0U;
> unsigned int v = 1U;
>
> Then all of the following evaluate to true:
>
> i < j
>
> j == u
>
> u < v
>
> but perversely, so does the following:
>
> v < i
>
> For me the fact that an unsigned int may be _less_ than a
> negative int settles the issue that unsigned int is _not_
> in any sense a reasonable approximate model ordinary integer
> arithmetic.

As Alf and others have pointed out, a good engineer/architect never
loses sight of the fact that all things must be considered in context.
What are the things you are comparing? Are they integers as a
mathematician might see them? Are they real numbers? Whole numbers?
Counting numbers? Is a double discrete or continuous? Can sqrt() take a
negative value? Are you sure? How would a 7-year-old answer this
question? What would Fourier or Dirac say? If the voltage on one of
the MOSFETs representing a bit in one of your signed ints suddenly
drops to zero due to electromagnetic disturbance from the cat, do you
throw an exception? Here, allow me:

if (/*cat hair causes electrostatic discharge*/)
    throw Good_Grief_That_Damn_Cat_Again; // Wow! Now this code is robust!

I am being facetious here, but you get the point. We could go on
forever like this trying to catch (at run-time, mind you) all defects
in our system. Or we can assume that the world is perfect and do no
error checking at all. But whatever we do, we must strike a balance and
let insight be our guide. And the ability of an architect to know where
to draw the line depends on how global his perspective is of the
system he is creating, the primitives that comprise it, and a judicious
choice of those primitives and how they are expected to interrelate.

So are you the type of engineer who would violate the virtue of one of
your most precious primitives to accommodate the potential
indiscretion of another undisciplined engineer? If so, there is a
price you pay that is far higher than an extra CPU cycle. It is an
insidious degradation in the elegance and purity of your system.

There is one thing that all the examples given by the pro-signed folks
have in common: They illustrate that someone somewhere (an engineer,
*not* a user) is doing something they should not have been doing in
the first place, and instead of dealing with the problem at its
source, you allow a bit of the "evil juice" to flow into your target
component. But of course, this mode of thinking is not limited to
engineers. It's also subscribed to by old women with poor eating
habits:

http://www.poppyfields.net/poppy/songs/oldwoman.html

-Chaud Lapin-

James Kanze
30 Apr 2003, 18.49.23
John Potter <jpo...@falcon.lhup.edu> wrote in message
news:<35crav8uauc25t3u0...@4ax.com>...

[...]


> The difficulty of getting down loops right with unsigned is the same
> as with pointers into arrays and iterators. That is why the standard
> has reverse_iterator to make things easier to get right.

And the solution (signed or unsigned) has always been to do what
reverse_iterator does internally:

for ( int i = top ; i > 0 ; -- i ) {
use( i - 1 ) ;
}

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, Tél. : +33 (0)1 30 23 45 16

KIM Seungbeom
30 Apr 2003, 21.48.36
ema...@soda.csua.berkeley.edu (E. Mark Ping) wrote in message news:<b8mano$eno$1...@agate.berkeley.edu>...

> In article <bd47bb0e.03042...@posting.google.com>,
> KIM Seungbeom <musi...@bawi.org> wrote:
> >Furthermore, I would encourage writing 'i > 0' or 'i != 0' instead of
> >just 'i' for a condition, for it is not intuitively seen as a boolean.
>
> This is slightly off-topic, but why? In C and C++, it is a common to
> use 'x' to mean 'x != 0', the effects are known and well-defined.

It's a matter of style.

In this case, i is not a true/false value in itself; it doesn't mean
a true/false thing, it's just a number, and the end value happens to
be the number zero. Zero here is not a singular value (as in pointers);
it is no different from other numbers in the range. If the end
value were not zero, we would have to write an explicit comparison
like i!=1 anyway. Then why treat it specially?

--
KIM Seungbeom <musi...@bawi.org>

Artem Livshits
30 Apr 2003, 21.52.57
Matthew Collett <m.co...@auckland.ac.nz> wrote in message news:<m.collett-FE20A...@lust.ihug.co.nz>...

> In article <ccd2e6e6.03042...@posting.google.com>,
> ba...@visionpro.com (David Barto) wrote:
>
> > If I am modeling temperature, and using degrees Kelvin, then unsigned
> > make since as you can't have negative degrees Kelvin, while with both
> > degrees C and F you can. If modeling skyscrapers, floors are unsigned
> > and height is signed, as you can dig below ground (or sea) level.
> >
> > David Barto
>
> Might you want to take the difference of two temperatures expressed in
> degrees Kelvin? Certainly you might. Do you want the result expressed
> as a signed quantity? Certainly you do. So your initial values must be
> signed.

Why? The absolute values and difference values are in different units,
so it's natural to express them using different types.

E.g.

absolute_value - absolute_value = difference_value
difference_value + difference_value = difference_value
difference_value - difference_value = difference_value
absolute_value + difference_value = absolute_value (if in range)
absolute_value - difference_value = absolute_value (if in range)
absolute_value + absolute_value doesn't make sense
etc.

It definitely can happen that absolute_value is better expressed as
an unsigned type while difference_value must have a signed type.

Something similar happens with pointer arithmetic -- the difference
between 2 pointers is a signed integer (ptrdiff_t), but it doesn't
mean that pointers must be signed or something.
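
A minimal sketch of how that distinction might be expressed in code
(Height and HeightDiff are made-up names, and the values are assumed to
fit in int):

    struct Height     { unsigned value; };   // absolute value: cannot be negative

    struct HeightDiff { int value; };        // difference: may be negative

    HeightDiff operator-(Height a, Height b)           // absolute - absolute = difference
    {
        HeightDiff d = { static_cast<int>(a.value) - static_cast<int>(b.value) };
        return d;
    }

    Height operator+(Height a, HeightDiff d)           // absolute + difference = absolute (if in range)
    {
        Height h = { a.value + d.value };
        return h;
    }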


Artem Livshits,
Brainbench MVP for C++
http://www.brainbench.com

nobody
30 Apr 2003, 22.03.21
Le Chaud Lapin wrote:
> For the record, I am pro-unsigned.

That, of course, is no surprise :-)

For the record, I completely agree with Ken Hagan's comments:
% I think all this thread proves is that integer types in C/C++ are
% a horrible mess and languages like Ada and Pascal get it right.
% Sadly, it is too late to change C or C++, so we all have to use
% compiler warnings instead.

That, too, is no surprise :-)

> ["nobody" wrote]:


> > Dr. Stroustrup is right and you are wrong.
>
> How interesting. I sent Bjarne a very long email message on the
> signed/unsigned subject a while back. Here's how his response began:
>
> "What you say, sounds logical. maybe it even is logical, but that's
> not the point. I have just seen too many examples of unsigned misused
> in C and C++ to feel comfortable with any C or C++ program that uses
> unsigned in comparisons."

Indeed, the perversity of mixed comparisons was what I singled out
in my response. Now, I would never have a problem with unsigned as
a subrange of a larger (in the sense of "having all values and
more") integer type. That's what Pascal and Ada have, and it works
_in the manner that one expects_ from one's experience with ordinary
integer mathematics, even in the presence of implicit conversions
(integral promotions), subject to the proviso that overflow is
avoided. The responsibility of the good engineer, then, is
straightforward: make sure that intermediate values in
(sub)expressions don't fall outside the allowed range of integers,
and make sure that inputs and final results fall within the range
that make sense for the data you are trying to model.

Now, the unsigned int type in C or C++ (and unsigned short or even
unsigned char on platforms where sizeof(short) == sizeof (int) or
sizeof(char) == sizeof(int) respectively) does _not_ behave as a
subrange of a larger integer type. The arithmetic is explicitly
defined to be _modular_ ... which leads to the problem that
operator < and operator > no longer function as in ordinary integer
arithmetic. For instance, for some unsigned u and v it is possible
that u + v < u or that u - v > u. Notice that values of unsigned
int with its modular arithmetic are analogous to points on a circle
not to points on a line, and the comparisons such as <, <=, >=, or >
have to be made with reference to a "branch cut" -- in this case the
half-line that divides UINT_MAX from 0U. This in and of itself might
not be too bad, but the problem is that C and C++ allow for integral
promotion from int to unsigned, which may not be value preserving --
and if it is not, then the meaning of comparison operators other than
== and != has silently _changed_ from its normal one.
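
A small sketch of the wrap-around effect just described (values chosen
only for illustration, assuming a 32-bit unsigned int):

    unsigned int u = 4000000000u;   // close to UINT_MAX
    unsigned int v = 1000000000u;

    bool b1 = u + v < u;            // true: the sum wraps past the branch cut
    bool b2 = 0u - v > 0u;          // true: the difference wraps the other way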

Back in the days when I wrote commercial programs in Pascal, I always
coded modular arithmetic operations explicitly, and when comparisons
with respect to a branch cut were needed, that was explicitly coded
too (window calculations in communication protocols are an example).
The latter relied upon value-preserving conversions from the unsigned
subranges used in the modular arithmetic into integer values, with
operators <, <=, >, and >= acting in the normal expected way, with no
surprises.

We've been taught that we should not set traps for other programmers
by defining overloaded operators that behave differently from what is
normal. The combination of non-value-preserving integral
promotions from int to unsigned and the behaviour of the comparison
operators <, <=, >, and >= in the presence of unsigned's modular
arithmetic leads to precisely that.

> Hmm....so I guess practice takes precedence over principle.

No, not at all. We are just acknowledging that there are flaws in
the original design of the integer types in C/C++ and we are urging
people to avoid falling into the resulting traps. In particular,
people need to avoid using unsigned as if it worked the way they
expect it to instead of the way it actually does. That means:

- Don't use unsigned int parameters to stop people from passing in
negative values. They can't do that, owing to the implicit
conversion rules.

- Don't use unsigned int for variables that may need to be compared
with signed quantities, even if those variables themselves cannot
take on negative values. Otherwise integral promotion rules will
guarantee that wrong things happen.

As the last consideration should make clear, there are other factors
other than the range of values a variable can take that dictate its
data type in C++. In particular, the types of operations in which
a variable can participate have an equally large bearing. My experience
suggests to me that in most situations the correct choice will be a
signed type; that's why I agree with Dr. Stroustrup and not with you.

Incidentally, I don't have any problems with unsigned int in certain
contexts:

- quantities that represent bit masks (where logical operations make
sense but <, <=, >=, and > usually do not figure in);

- situations where modular arithmetic is acceptable and/or desirable,
but again this almost always is where <, <=, >=, and > do not figure
in, for example in this code fragment posted by John Potter:

% for (vector<T>::size_type i = v.size() - 1; i != -1; -- i)
% process(v[i]);
%
% Still works. Whatever unsigned type is used for size_type, -1 is
% 0 - 1 and everything is beautiful.

nobody

Pierre Baillargeon
30 Apr 2003, 22.04.08
> If so, there is a
> price you pay that is far higher than an extra CPU cycle. It is an
> insidious degradation in the elegance and purity of your system.
>
> There is one thing that all the examples given by the pro-signed folks
> have in common: They illustrate that someone somewhere (an engineer,
> *not* a user) is doing something they should not have being doing in
> the first place, and instead of dealing with the problem at its
> source, you allow a bit of the "evil juice" to flow into your target
> component.

Yes, it is something called integration. You advocate designing
components in absolute isolation, ignoring the global view. It will be
a rare program that will need only unsigned numbers. As soon as you
need to mix signed with unsigned, you are better off using a single
signed type.

The gains are so small anyway: a little extra range, an initially
always-correct value. As soon as you add or subtract them they lose their
perfection and can yield overflows and negatives. So what if an unsigned
strlen() would better represent the initial concept? The first time you
begin to slice and move within the string you are bound to get the 0
limit condition wrong. Ample posts in this thread have shown that
people make the same errors again and again.

So the little gains, and cartesian ego-satisfaction, of choosing
unsigned types are instantly lost as soon as you start actually using
them. That is what Bjarne talks about in his book and the email you
quoted.

LLeweLLyn
30 Apr 2003, 22.20.13
al...@start.no (Alf P. Steinbach) writes:

> On 29 Apr 2003 09:16:12 -0400, LLeweLLyn <llewe...@xmission.dot.com> wrote:
>
> >unorigina...@yahoo.com (Le Chaud Lapin) writes:
> >
> > > In Section 4.4 on page 73 of "The C++ Programming Language",
> > > Stroustrup writes
> > >
> > > "The unsigned integer types are ideal for uses that treat storage as a
> > > bit array. Using an unsigned instead of an int to gain onr more bit
> > > to represent positive integers is almost never a good idea. Attempts
> > > to ensure that some values are positive by declaring variables
> > > unsigned will typically be defeated by the implicit conversion
> > > rules."
> >
> >Specifically, negative signed values will be silently transumuted to
> > high unsigned values, because of the implicit conversion from
> > signed to unsigned.
>
> Let us not forget that for many (nearly all) C++ implementations
> modulo arithmetic is also used for signed integers.

However, in the case of signed integers the problem boundaries are both
relatively far from the areas most values lie in. For unsigned, one
boundary - 0 - is all too close.

>
> To maintain that well-defined is a problem, whereas not well-defined
> (signed arithmetic) but in practice the same behavior, should be a
> solution, well, with all respect, I fail to see the logic in that.

That's because I forgot a piece: in signed arithmetic, the boundary
cases are far away from typical values. As long as there are no
boundary cases, modulo arithmetic is quite like ordinary
arithmetic. In unsigned arithmetic, the 0 boundary is right next
door.

> >There's an additional problem with unsigned arithmetic: it isn't the
> > natural number arithmetic we are used to. It's modular
> > arithmetic. (See 3.9/4)
>
> That isn't "additional", it's the reason for the conversion rules.
>
> And I for one don't see it as a "problem", but as a feature... ;-)

> --

It's a feature iff you know how to use it. That hasn't been common
knowledge at any place I've worked (though it is in this forum).

> > In natural number arithmetic, 3 - 5 is an
> > error - negative numbers aren't in the set of natural numbers. In
> > unsigned arithmetic, 3 - 5 has an implementation-dependent but
> > well-defined result - no compile or runtime error is
> > triggered. And since that result isn't negative, unsigned numbers
> > don't model integers either.
>
> Let's not forget that for many (nearly all) C++ implementations the
> same general behavior is exibited by signed arithmetic, but with
> one crucial difference: for signed arithmetic this behavior is not
> specified by the standard, and so is not universally portable.
>
> Unsigned arithmetic: modulo arithmetic by definition.
>
> Signed arithmetic: modulo arithmetic in practice, but not by definition
> (can that really be _better_?).
>
>
>
> >So unsigned numbers present two kinds of traps:
>
> (a) and (b) below are not "two kinds" but the same, namely
> modulo arithmetic.
>
> (a) and (b) below are IMO not "traps".
>
> Most people have, I would presume, everyday experience with clocks and
> degrees of angle, both of which are based on modulo arithmetic.

Do they do arithmetic with either of these types? I had a CS professor
who assumed everyone knew degrees were modulo 360 in a test
problem. In a class of 175, 6 people (myself included) got the
answer right. I think most programmers forgot all about math on
angles the moment they left trigonometry or calculus class.

>
> > (a) Implicit conversions.
> >
> > (b) They don't model natural numbers, integers, or any other
> > number category most people have everyday experience with.
>
> For a typical C++ implementation, signed integers present three problems
> to the novice programmer:
>
>
> (a) Implicit conversions (herein included promotions).
>
> (b) Modulo arithmetic.
>
> (c) That (b) is not specified by the standard.
>
> But as I and others have written elsewhere in this thread, the technical
> "problems" of integer arithmetic -- signed or unsigned -- are not
> really the most important issue in selecting a type to use.

*shrug* I've spent enough hours tracking down bugs due to
misunderstandings about unsigned arithmetic that I don't find
your arguments convincing.

I have, however, begun to wonder if the bugs I see from unsigned
arithmetic have more to do with unfamiliarity than with modulo
arithmetic.

> The issue is instead, to balance the needs of conveying information to
> the programmer (preventing bugs, in many cases catching them at compile
> time) versus supporting ease of firewall programming and debugging.
> This balancing act will in general require intelligent, context-dependent
> decisions. And so I don't believe "signed/unsigned is bad" is good.

[snip]

Nor do I - but I think that using unsigned everywhere a number 'is
not supposed to be negative' is a recipe for trouble. Further, IMO
most good uses of unsigned use bitwise operations and not
arithmetic operations.
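
For example, a minimal sketch of the kind of unsigned use meant here -
flag bits rather than arithmetic (the flag names are made up):

    const unsigned READABLE   = 1u << 0;
    const unsigned WRITABLE   = 1u << 1;
    const unsigned EXECUTABLE = 1u << 2;

    unsigned perms = READABLE | WRITABLE;          // set some flags
    bool can_write = (perms & WRITABLE) != 0;      // test a flag
    perms &= ~EXECUTABLE;                          // clear a flag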

LLeweLLyn
1 May 2003, 07.42.15
unorigina...@yahoo.com (Le Chaud Lapin) writes:

> For the record, I am pro-unsigned.
>
> nobody <clcppm...@this.is.invalid> wrote in message news:<200304290647...@localhost.localdomain>...
> > Dr. Stroustrup is right and you are wrong.
>
> How interesting. I sent Bjarne a very long email message on the
> signed/unsigned subject a while back. Here's how his response began:
>
> "What you say, sounds logical. maybe it even is logical, but that's
> not
> the point. I have just seen too many examples of unsigned misused in C
> and
> C++ to feel comfortable with any C or C++ program that uses unsigned
> in
> comparisons." - Bjarne Stroupstrup
>
> Hmm....so I guess practice takes precedence over principle.

Yes. Just as the empirical techniques pioneered by Bacon, Newton, and
others take precedence over the logic techniques demonstrated by
Aristotle.

Matthew Collett
1 May 2003, 07.54.24
In article <6dc54885.0304...@posting.google.com>,
a_e_li...@yahoo.com (Artem Livshits) wrote:

> > Might you want to take the difference of two temperatures expressed in
> > degrees Kelvin? Certainly you might. Do you want the result expressed
> > as a signed quantity? Certainly you do. So your initial values must be
> > signed.
>
> Why? The absolute values and difference values are in different units,
> so it's natural to express them using different types.

They are not in different _units_, but you are correct that considered
in the abstract they may well be different types.

> absolute_value - absolute_value = difference_value

> etc.

In general, I agree with you. Indeed, this naturally generalises
further to the geometrical case, where the absolute value is a point and
the difference type is a vector, and maintaining the distinction is very
important.

> It definitely can happen that absolute_value is better be expressed as
> an unsigned type while difference_value must have a signed type.

In particular however, this runs afoul of the C(++) promotion rules for
integer arithmetic. E.g. if a and b are unsigned, then so is the
difference.

IOW, in an ideal world, you would be right. With the language we
actually have, trying to do things this way results in lots of ugly
casts all over the place.

Best wishes,
Matthew Collett

--
Those who assert that the mathematical sciences have nothing to say
about the good or the beautiful are mistaken. -- Aristotle

Andrea Griffini
1 May 2003, 11.02.00
On 29 Apr 2003 05:29:36 -0400, al...@start.no (Alf P. Steinbach)
wrote:

> size = -1; // unsigned => warning, signed => no warning.

I personally find this a very small accomplishment.
Sure you can find compilers that issue a warning when
you write

unsigned int x;
x = -1;

but my guess is that those that inform you of the
problem when you write

unsigned int x;
int y = -1;
x = y;

are much fewer.

I also think that using -1 to set up an unsigned int
is a perfectly valid use when you need that unsigned
value, but my feeling is that the cases where you need
unsigned values are far fewer than you seem to
think.

>If the function's contract is to range-check its argument
>in some way, then, by definition of "contract", it should
>do so regardless of the physical type of argument.
>
>If, on the other hand, its contract doesn't include range
>checking, then it doesn't need to do any range-checking.

This probably means that you have a different idea of
what a contract is, and of how to write robust code.
I don't want to go into details here, but a function
"allocate_buffer" that doesn't do any parameter checking
because the parameter is an unsigned value is a function
I wouldn't like to have around. To me it really seems just
nearer to a function declared as

allocate_buffer(unsigned short size);

Do you think this is even "safer" ? (I'm supposing that
here allocating (unsigned short)-1 bytes is reasonable
while (unsigned int)-1 is not).

To me

allocate_buffer(int size);

is better than

allocate_buffer(unsigned size);

that is better than

allocate_buffer(unsigned short size);

and this is because the first can be checked for bad
requests, the second messes things up a bit, and
the third just pretends to invent reasonable requests
even after receiving nonsense ones, completing the
"hiding errors" approach.

Note that I said "receiving" just because I happen
to think that this behaviour is a problem of how
the function is written (declared), even if one should
probably formally classify the coercion as happening
in the calling code, and not in the function.
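
As a sketch of the kind of check the int version above makes possible
(allocate_buffer and the limit are hypothetical names here; failing
could equally mean an assert or an exception):

    #include <new>
    #include <cstddef>

    const int MAX_REASONABLE_SIZE = 1 << 20;   // made-up limit

    void* allocate_buffer(int size)
    {
        if (size < 0 || size > MAX_REASONABLE_SIZE)   // a negative request is still visible here
            return 0;
        return ::operator new(static_cast<std::size_t>(size));
    }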

So if you're a lawyer and you don't really care about
ending up with a working program, but you only need to
have someone else to blame for the program not working
then definitely using an unsigned short parameter
is IMO the best option of the three.

>Note that it can do meaningful range checking regardless of
>whether the argument type is signed int or unsigned int --
>the valid range could, for example, be 0 through
>std::numeric_limits<int>::max() in both cases, and both
>cases then has some range of potential but disallowed values.

Like others have said, the problem with range is that
unsigned types are not ints that can't be negative;
they're ints that just get funny positive values when
they would become negative. Sure, ints exhibit funny
(undefined) behaviour too, but that happens *far* from
normal usage and in platform-dependent areas.

If you are interested in the funny way unsigned ints
work (i.e. in the modulo arithmetic) then unsigned
ints are just great; if you need integers that can't be
negative then I think unsigned is NOT your best option;
probably a user-defined type could serve you better.
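
A minimal sketch of such a user-defined type (NonNegative is a made-up
name; the check could equally throw instead of assert):

    #include <cassert>

    class NonNegative
    {
    public:
        NonNegative(int v) : value_(v) { assert(v >= 0); }   // a bad value is caught here, not hidden
        int get() const { return value_; }
    private:
        int value_;
    };

    void allocate_buffer(NonNegative size);   // a negative argument now trips the assert at the call site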

>Note that with unsigned argument type (as this contract requires)
>a good compiler will indicate the bug by issuing a warning; hence,
>the bug can be caught at compile time instead of run-time.
>
>With signed argument type this same client-code bug would slip
>undetected through compilation, only to be detected at run-time.
>
>What's best: to hunt it down then, or catch it at compile time?

That works only for compile-time expressions, I hope.
Or maybe your compiler issues a warning for *every*
conversion from an integer to an unsigned integer
value (for example, every operator[] access in
std::vector done using integers)?

I'm also surprised at you bringing up this point,
which I find almost complete nonsense.
I don't know if this signed/unsigned thing is one of
those religious wars ... I'm relatively new to C++.
For some strange reason I thought this was just a
real logical thread.

> >Let me take this just a little step further...
>
>I'm sorry that I fail to see the connection with the above,
>but I'll try to answer as best I can anyway.

Oh, you don't need to. At least not for me.
My fighting needs are better satisfied with
lightning chess and karate.

>In my experience whether something is better or worse,
>good enough or not, depends very much on the context.

>In this case, the context would be the function's contract
>and purpose, which is not specified; an example where the
>masking could be relevant could be a function that uses
>the three lower bits as a distinct part of a multi-part
>value packed in an int (this would be at a very low level).

Seems we can't agree on this. If I need a number between
0 and 7 then IMO it's better to accept an int and check
that it is in the proper range. A different case would be a
function that accepts any number, because it's logical to
handle any number, and then for internal reasons does
that or other manipulations.

To me a function named like "allocate_buffer" doesn't
sound like a function written to accept anything as
a reasonable request.
However, I'm not one to question the names people like
to give to functions in their programs...

>Therefore, I must humbly disagree with the above statement.

I'm amazed by the "humbly". What does it mean ?

Andrea

Francis Glassborow
1 May 2003, 11.10.31
In message <m1fznz3...@localhost.localdomain>, LLeweLLyn
<llewe...@xmission.dot.com> writes

>Do they do arithmetic with either of these types? I had a CS professor
> who assumed everyone knew degrees were modulo 360 in a test
> problem. In a class of 175, 6 people (myself included) got the
> answer right. I think most programmers forgot all about math on
> angles the moment they left trigonometry or calculus class.

Then I think that actually the six of you got it wrong :-) A wheel that
has rotated 720 degrees may be very different from one that has rotated
through 0 (think of your wheel being one of the cogs in an old watch, or
a pulley on a crane). I seem to remember that Quantum Physics also has
some surprisingly different views of rotation.

Actually this is a good example of what this thread is about, underlying
assumptions about the problem domain.

When doing bit twiddling I either use an unsigned integer type or a
bitset. When doing arithmetic I invariably use signed types. When doing
comparisons I get very annoyed by size_t and various size() returns from
STL containers being unsigned. They just do not work well with C & C++
integer promotion rules even though they may seem logically correct for
the purpose of measuring the size of something.


--
ACCU Spring Conference 2003 April 2-5
The Conference you should not have missed
ACCU Spring Conference 2004 Late April
Francis Glassborow ACCU

Alf P. Steinbach
1 May 2003, 18.06.26
On 30 Apr 2003 22:20:13 -0400, LLeweLLyn <llewe...@xmission.dot.com> wrote:
>*shrug* I've spent enough hours tracking down bugs due to
> misunderstandings about unsigned arithmetic that I don't find
> your arguments convincing.

Are you sure that most of that time hasn't really been tracking down
bugs in _signed_ arithmetic? ;-)

But OK, what you probably mean is mixed arithmetic, or perhaps bugs
due to unspeakables like 's.length()-5 >= 0', or some such.

"Avoid unsigned arithmetic" seems much like "avoid pointers", "avoid
raw arrays", "avoid casts", "avoid printf", in short, "avoid C".

In that case I think the proper thing to do would be to use another
language, not C++ -- or improve the allocation of programmers.

My position of conveying information to the programmer assumes a
competent C++ programmer, and I'm sure that any self-limitation, avoiding
this or that C feature, will not help one little bit in the case of
incompetent programmers (besides, as long as those features are used by
the standard library, should the library also be avoided?).


>> The issue is instead, to balance the needs of conveying information to
>> the programmer (preventing bugs, in many cases catching them at compile
>> time) versus supporting ease of firewall programming and debugging.
>> This balancing act will in general require intelligent, context-dependent
>> decisions. And so I don't believe "signed/unsigned is bad" is good.
>[snip]
>
>Nor do I - but I think that using unsigned everywhere a a number 'is
> not supposed to be negative' is recipe for trouble. Further, IMO
> most good uses of unsigned use bitwise operations and not
> arithmetic operations.

Ouch.

I'm beginning to wonder whether std::bounded_int etc. could be useful.

All with bounds-checking.

In the end, perhaps the recommended subset of C++ will be functionally
equivalent to Jensen & Wirth Pascal.

No programmer, no matter the quality of the educational system, could
possibly misunderstand J&W Pascal?


Cheers,

- Alf

Gavin Deane
1 May 2003, 18.10.18
> There is one thing that all the examples given by the pro-signed folks
> have in common: They illustrate that someone somewhere (an engineer,
> *not* a user) is doing something they should not have being doing in
> the first place, and instead of dealing with the problem at its
> source, you allow a bit of the "evil juice" to flow into your target
> component.

When was the last time you wrote a non-trivial piece of software
straight off bug free? If you want to deal with the problem you need
to find the problem. How do you expect to find it if you never look for
it? All engineers will make mistakes sometimes. What is wrong with
looking for mistakes where they are easiest to detect, even if they
need to be fixed elsewhere? Surely that's better than building a
system on the assumption that everyone involved is perfect and then
waiting to see how it breaks. Or whether it breaks at all - it may just
silently do the wrong thing, of course.

GJD

Alf P. Steinbach
1 May 2003, 18.20.16
On 1 May 2003 11:02:00 -0400, Andrea Griffini <agr...@tin.it> wrote:

>On 29 Apr 2003 05:29:36 -0400, al...@start.no (Alf P. Steinbach)
>wrote:
>
>> size = -1; // unsigned => warning, signed => no warning.
>
>I personally find this a very little accomplishment.

Could you clarify that, please? (I'm stumped.)

>Sure you can find compilers that issue a warning when
>you write
>
> unsigned int x;
> x = -1;
>
>but my guess is that those who informs you of the
>problem when you write
>
> unsigned int x;
> int y = -1;
> x = y;
>
>are much less.

If by "much less" you mean that few current compilers will
warn about that, then that's probably correct.

Put a 'const' on the 'y' and the opposite is probably true.

Presumably this behavior is because the compiler can then
avoid a lot of analysis, which however is required for other
reasons (optimization?) in the case of 'const'.

We will always need better compilers, yes.

Or perhaps higher warning levels... ;-)

Every example of the kind you illustrate above has at least
one counter-example, which was my point. In this case


unsigned f()
{
    return std::numeric_limits<unsigned>::max();
    // Or something.
}

unsigned y = f();
int x = y;    // value too big for int: implementation-defined result, typically with no warning


Instead of 'unsigned' try to imagine 'size_t', and instead of
'f' try to imagine e.g. 'string::length'.

What you illustrate is therefore not a reason to avoid
unsigned (as you argue), but to avoid incorrect signed/
unsigned conversion -- which depends highly on the context,
and which is an issue with, e.g., the standard library, which
I for one would not argue that one should avoid (in general).

>I also think that using -1 to set up an unsigned int
>is a perfectly valid use when you need that unsigned
>value

It is. So what?

>but my idea is that the cases where you need
>unsigned values are far less than what seems being
>your opinion.

Many Perl programmers manage well with just strings and
doubles. And automatic error-free conversion of any string
to double, and vice versa. So there's no absolute 'need'.

It is, instead, an engineering and design issue.


>>If the function's contract is to range-check its argument
>>in some way, then, by definition of "contract", it should
>>do so regardless of the physical type of argument.
>>
>>If, on the other hand, its contract doesn't include range
>>checking, then it doesn't need to do any range-checking.
>
>This probably means that you have a different idea of
>what a contract is, and how to write robust code.

To me it's just an out-of-context quote, with a hypothetical
& incorrect conclusion substituted for the stated conclusion.

>...
>>I'm sorry that I fail to see the connection with the above,
>>but I'll try to answer as best I can anyway.
>
>Oh, you don't need to. At least not for me.
>My fighting needs are better satisfied with
>lightning chess and karate.

Oh. I see.

>>Therefore, I must humbly disagree with the above statement.
>
>I'm amazed by the "humbly". What does it mean ?

It indicates a measure of respect, of not being or trying to
be arrogant or prideful, of openness of mind.

See e.g. <url:http://define.ansme.com/words/h/humble.html>.


Hth.,

- Alf

Le Chaud Lapin
1 May 2003, 18.51.39
nobody <clcppm...@this.is.invalid> wrote in message news:<200304301806...@localhost.localdomain>...

> - situations where modular arithmetic is acceptable and/or desirable,
> but again this almost always is where <, <=, >=, and > do not figure
> in, for example in this code fragment posted by John Potter:
>
> % for (vector<T>::size_type i = v.size() - 1; i != -1; -- i)
> % process(v[i]);
> %
> % Still works. Whatever unsigned type is used for size_type, -1 is
> % 0 - 1 and everything is beautiful.
>

This is hardly beautiful. And unless my memory fails me, on a
sign-magnitude machine, neither is it correct.

On a 2's-complement machine, -1 has bit pattern 0xFFFFFFFF (for example).

On a sign-magnitude machine, -1 has 0x80000001.

So, on a sign-magnitude machine, reduction of the unsigned int 0U by 1
yields bit-pattern 0xFFFFFFFF, which is not equal to 0x80000001.

(Is this right? I never had the luxury of formal training in comp
sci, so I could be wrong. If so, I'm sure you'll be quick to let me
know. :P)

-Chaud Lapin-

Le Chaud Lapin
1 May 2003, 18.53.38
Francis Glassborow <francis.g...@ntlworld.com> wrote in message news:<VGnQlvIO...@robinton.demon.co.uk>...

> When doing bit twiddling I either use an unsigned integer type or a
> bitset. When doing arithmetic I invariably use signed types. When doing
> comparisons I get very annoyed by size_t and various size() returns from
> STL containers being unsigned. They just do not work well with C & C++
> integer promotion rules even though they may seem logically correct for
> the purpose of measuring the size of something.

"They" or "you"? "They" do not do anything until "you" ask them to.

-Chaud Lapin-

Le Chaud Lapin
1 May 2003, 18.55.30
pier...@hotmail.com (Pierre Baillargeon) wrote in message news:<6df0c6a8.03043...@posting.google.com>...

> > If so, there is a
> > price you pay that is far higher than an extra CPU cycle. It is an
> > insidious degradation in the elegance and purity of your system.
> >
> > There is one thing that all the examples given by the pro-signed folks
> > have in common: They illustrate that someone somewhere (an engineer,
> > *not* a user) is doing something they should not have being doing in
> > the first place, and instead of dealing with the problem at its
> > source, you allow a bit of the "evil juice" to flow into your target
> > component.
>
> Yes, it is something called integration. You advocate to design
> components in absolute isolation, ignoring the global view. It will be
> a rare program that will need only unsigned numbers. As soon as you
> need to mix signed with unsigned, you are better off using a single
> signed type.

No, you identify those interfaces where cross-over is about to occur,
and deal with the disparity in type in a thoughtful, deliberate
manner. I think those who studied computer science call this
methodology "enforcing the abstraction barrier."

Perhaps the principle at stake in this discussion is:

"What constitutes virtuous systems design?"

Some of us, the pro-unsigned people, are arguing that virtuous systems
design goes far beyond making a component resilient to the belligerent
behaviour of another component. So we are placing less emphasis on
elemental robustness and more on elemental purity.

Yet some of us, the pro-signed people, are saying, "Yada, yada,
purity, smurity, whatever, the thing has to work in the real world
with people who do dumb things in spite of admonitions not to do them."
The pro-signed people might go even further and regard the act of
using 'int' instead of 'unsigned int' as moving closer to conceptual
purity.

The critical difference in these perspectives is that the latter
presumes the permissibility of defective design.

Which group are you in?

Maybe if we were not talking about software engineering but some other
engineering, say mechanical, the argument for conceptual purity would
become clearer. In such engineering disciplines, you do not have the
luxury of waving your wand and insulating against belligerent components.
The cost is too great.

Let's say you're a mechanical engineer for a car manufacturer, and
your boss catches you designing a coating for one of your components
to protect against battery acid. He might ask,..

B: "What are you doing to that engine belt?"
Y: "I am making it resilient against sulfurc acid."
B: "Why are you doing that?"
Y: "Because the battery keeps leaking on it."
B: "What battery? You mean Fred's battery?"
Y: "Yep."
(boss marches over to Fred's office)
B: "Is it true that your batteries leak?"
F: "Well, not all the time, just ocassionally."
B: "Look Fred, that's not acceptable."
F: "I know. I am adding reinforcement plates to all my batteries."
B: "And why are you doing that?"
F: "Rodney's rod kept poking me when the car vibrated."
B: "What the h*LL!!!"
(boss marches over to Rodney's office)
R: "Hi Boss."
B: "Are your chassis rods breaking Fred's batteries?"
R: "Yeah, I am ordering clamps right now to hold them down."
B: "Why are you poking Fred's batteries?"
R: "It's not me Boss, it's Tony. It's those new shock absorbers."
(boss marches over to Tony's office)
B: "Tony, just so you know, I am at 212 Fahrenheit!"
T: "What Boss, what's matter?"
B: "Didn't I say 'no' to you last week to to those Super-X shock
absorbers?"
T: (pause) - "Well, yeah, but..."
B: "But what..."
T: "Its' not really a problem anymore."
B: "Oh it isn't is it??"
T: "Nope, it's fixed. Rodney is putting rubber boots on his rods."

This skit illustrates the insidious effect of not containing a problem
at its source. However, a primary difference between software
engineering and mechanical engineering is material cost. The
pro-signed people could argue that the incremental material cost of
making their system "robust" by using signed and throwing an
exception, etc., is negligible, and therefore acceptable.

But what would happen if the incremental cost of rubber boots, metal
plates, and acid buffers were completely zero? Furthermore, what
would happen if the requisite mental energy for making these
automobile components more robust were on par with that for
catching/throwing exceptions? Do you think car manufacturers would
want to design automobiles that way? Of course not.

Good engineers, as best they can, focus on making their components
pure, not intrepid. And when they have a choice between purity and
intrepidity, they choose purity, knowing that it is purity that best
leads to overall "system" virtue.

It is a mistake to spend an inordinate amount of time designing
components to protect themselves from the belligerent behaviour of
sibling components. Doing so can lead to chaos and overall system
uncertainty. When you are done, you might have something that works,
but you cannot really attest to the overall virtue of the system
because you made the fateful fundamental presumption of
defectively-designed components. This mode of thought can have
catastrophic consequences. Martian landers, space shuttles, Concorde
jets, bridges, dams, and nuclear reactors are all examples where
someone knew how to get it right, but someone else, more guided by
practice than principle, was content that it worked well enough.

Duct tape is not the answer. Getting it right is the answer.

If you get it wrong the first time, you might still have to get it
right later (unless it explodes). So you might as well get it right
the first time, insist that others do the same, and if they don't, and
some surely won't, deal with the imperfection at that juncture where
it manifests.

-Chaud Lapin-

Le Chaud Lapin

ulæst,
1. maj 2003, 18.58.1001.05.2003
til
LLeweLLyn <llewe...@xmission.dot.com> wrote in message news:<m1bryn3...@localhost.localdomain>...

> unorigina...@yahoo.com (Le Chaud Lapin) writes:
>
> Yes. Just as the empirical techniques pioneered by Bacon, Newton, and
> others take precedence over the logic techniques demonstrated by
> Aristotle.

Not to get too far off topic, but Aristotle did considerable empirical
work (in biology for example). His work is less remembered because
it was not as sensational as say, a Tycho Brahe sextant (in the visual
sense), and given that he knew he had less than 100 years to figure
out the universe, he did the best he could by offering a mixture of
practical results with speculative reasoning. I am sure that if there
had been a Keck II telescope available in his time, he might have
hopped on a 747 and flown to Hawaii to use it. But did he get his
hands dirty? Certainly.

Conversely, both Bacon and Newton, as well as most other great
scientific thinkers, were also men of principle. I would go even
further to argue that it was principle, and not practice, that
permitted them to see what their peers could not and put them on the
path of revelation. You do not "discover" things like gravity through
measurement. You verify a hypothesis of the existence of gravity
through measurement. What separated these great men from the rest of
us was not practice (though practical embodiment could have the added
effect of making their assertions nearly uncontestable - the nuclear
bomb). It was their ability to siphon the intangible truth from an
intellectual void, then present it to the world in the form of a
tangible model.

-Chaud Lapin-

Jim Melton

ulæst,
2. maj 2003, 05.49.1402.05.2003
til
"Francis Glassborow" <francis.g...@ntlworld.com> wrote in message
news:VGnQlvIO...@robinton.demon.co.uk...
> In message <m1fznz3...@localhost.localdomain>, LLeweLLyn
> <llewe...@xmission.dot.com> writes
> >Do they do arithmetic with either of these types? I had a CS professor
> > who assumed everyone knew degrees were modulo 360 in a test
> > problem. In a class of 175, 6 people (myself included) got the
> > answer right. I think most programmers forgot all about math on
> > angles the moment they left trigonometry or calculus class.
>
> Then I think that actually the six of you got it wrong :-) A wheel that
> rotates 720 degrees may be very different from one that has rotated
> through 0 (think of your wheel being one of the cogs in an old watch, or
> a pulley on a crane). I seem to remember that Quantum Physics also has
> some surprisingly different views of rotation.
>
> Actually this is a good example of what this thread is about, underlying
> assumptions about the problem domain.


Exactly! You have added constraints that were not in the referenced
paragraph. The angular measurement of 720 degrees is exactly the same as the
angular measurement of 0 degrees. However, if you introduce side effects of
rotation (as in your watch cog or crane pulley) then although the
orientation of the wheel is the same in both cases, the state of the
"system" is not.

Unfortunately, I've lost what this had to do with unsigned vs signed.

(FWIW, I use an AngularValue type to represent degrees of rotation, and the
modulo is performed implicitly when the internal representation is modified.
Is the internal representation signed or unsigned? What difference does it
make?!! Internal representation is none of your business :-)
--
<disclaimer>
Opinions posted are those of the author.
My company doesn't pay me enough to speak for them.
</disclaimer>
--
Jim Melton
Software Architect, Fusion Programs
Lockheed Martin Astronautics
(303) 971-3846

Andrea Griffini

ulæst,
2. maj 2003, 09.15.3602.05.2003
til
On 1 May 2003 18:20:16 -0400, al...@start.no (Alf P. Steinbach) wrote:

>On 1 May 2003 11:02:00 -0400, Andrea Griffini <agr...@tin.it> wrote:
>
>>On 29 Apr 2003 05:29:36 -0400, al...@start.no (Alf P. Steinbach)
>>wrote:
>>
>>> size = -1; // unsigned => warning, signed => no warning.
>>
>>I personally find this a very little accomplishment.
>
>Could you clarify that, please? (I'm stumped.)

I'm not going to open a good wine bottle just because I
found a compiler that informs me I'm assigning a negative
constant value to an unsigned variable.
In my experience that is surely NOT the major source of
problems of unsigned/signed conversions.

>If by "much less" you mean that few current compilers will
>warn about that, then that's probably correct.
>
>Put a 'const' on the 'y' and the opposite is probably true.
>
>Presumably this behavior is because the compiler can then
>avoid a lot of analysis, which however is required for other
>reasons (optimization?) in the case of 'const'.

I think the reason is another... the reason is IMO
that in the C++ language unsigned integers are NOT
integers that can't hold negatives, but "modulo" integers.
Mixing them with integers isn't such an issue if you
just do that when you need that "modulo" property
(an interesting but not so common situation).
The problems start when one insists on using them as
non-negative integers... and somehow the standard
library did this in a few places for no good reason.

>We will always need better compilers, yes.
>
>Or perhaps higher warning levels... ;-)

I think that a warning level informing of all implicit
conversions from signed to unsigned would be completely
useless. My impression is that with current (mis)use
of unsigned types by the standard library a C++ program
with all explicit casts and "U" after literals wouldn't
be better in any respect.

>Every example of the kind you illustrate above has at least
>one counter-example, which was my point. In this case
>
> unsigned f()
> {
> return std::numeric_limits<unsigned>::max();
> // Or something.
> }
>
> unsigned y = f();
> int x = y;
>
>
>Instead of 'unsigned' try to imagine 'size_t', and instead of
>'f' try to imagine e.g. 'string::length'.

Like I've been trying to say in my bad english sure
you can get bad behaviour of signed integers too.
You even get undefined behaviour... but that happens
FAR from normal uses, while the strange behaviour
of unsigned types is just one unit distant from
probably the most used number in programming (I mean
strange if you think of unsigned as integers that
can't become negative, and not as "modulo" integers).

>What you illustrate is therefore not a reason to avoid
>unsigned (as you argue), but to avoid incorrect signed/
>unsigned conversion -- which depends highly on the context,
>and which is an issue with the e.g. standard library, which
>I for one would not argue that one should avoid (in general).

Unsigned numbers are lovely when I need the property
that characterizes them... i.e. that they behave as
elements of Z_{2^N}. They also have another property,
that is a range extended by just a bit, but this is,
excluding very specific cases, not worth the hassle
in my opinion.

IMO using them as parameters so you don't have to do
checks for negative values is nonsense. IMO using them
as return values that are integers that just happen
never to be negative is nonsense. IMO using them to
gain the extra range is not worth the cost in bugs
this will imply in a language like C++.

>>I also think that using -1 to set up an unsigned int
>>is a perfectly valid use when you need that unsigned
>>value
>
>It is. So what?

So the warning could be annoying! Actually I think that
the percentage of programs in which it would be helpful
is smaller than the percentage in which it would just
be a false positive (i.e. the programmer indeed wanted
to assign that number).

>>but my idea is that the cases where you need
>>unsigned values are far less than what seems being
>>your opinion.
>
>Many Perl programmers manage well with just strings and
>doubles. And automatic error-free conversion of any string
>to double, and vice versa. So there's no absolute 'need'.
>
>It is, instead, an engineering and design issue.

Exactly. In my practical experience unsigned (mis)use
in the standard library is mostly a source of problems.
Reading this thread, it is comforting to see I'm not the
only one in this position.

>To me it's just an out-of-context quote, with a hypothetical
>& incorrect conclusion substituted for the stated conclusion.

I don't understand what you mean, and I have a slight suspicion
that this was just the reason for that phrase.

>>>Therefore, I must humbly disagree with the above statement.
>>
>>I'm amazed by the "humbly". What does it mean ?
>
>It indicates a measure of respect, of not being or trying to
>be arrogant or prideful, of openness of mind.
>
>See e.g. <url:http://define.ansme.com/words/h/humble.html>.

Thank you... but I already knew the formal meaning of the
word. But words don't tell everything... it's quite common
to have them meaning the exact opposite of what one can
find in a dictionary. Not hearing your voice or seeing your
eyes while you say "humbly", I really can't tell.

I even think I've read somewhere that using IMO is better
than using IMHO, just to avoid people being confused by
what the H *really* means.

Andrea

John Potter

ulæst,
2. maj 2003, 09.19.0602.05.2003
til
On 1 May 2003 18:51:39 -0400, unorigina...@yahoo.com (Le Chaud
Lapin) wrote:

> nobody <clcppm...@this.is.invalid> wrote in message news:<200304301806...@localhost.localdomain>...
> > - situations where modular arithmetic is acceptable and/or desirable,
> > but again this almost always is where <, <=, >=, and > do not figure
> > in, for example in this code fragment posted by John Potter:

> > % for (vector<T>::size_type i = v.size() - 1; i != -1; -- i)
> > % process(v[i]);

> > % Still works. Whatever unsigned type is used for size_type, -1 is
> > % 0 - 1 and everything is beautiful.

> This is hardly beautiful.

Beauty is in the eye of the beholder? Let's say wonderful. :)

> And unless my memory fails me, on a
> sign-magnitude machine, neither is it correct.

> On 2's-complement machine, -1 has bit pattern 0xFFFFFFFF (for example)

> On sign-magnitude machine, -1 has 0x80000001

> So, on sign-magnitude machine, reduction of the unsigned int 0U by 1
> yields bit-pattern 0xFFFFFFFF, which is not equal to 0x80000001.

On a standard conforming C++ compiler, unsigned(-1) is the largest
unsigned by definition. The bit pattern change is the compiler writer's
problem. It happens naturally for s2c and must be forced for s1c and sm.

There is an assumption in the above that vector<T>::size_type is at least
unsigned int. To make it work for a silly implementation using unsigned
char or short (smaller than int) it would require an explicit cast.

for (vector<T>::size_type i = v.size() - 1;
     i != vector<T>::size_type(-1); -- i)
    process(v[i]);

John

nobody

ulæst,
2. maj 2003, 09.23.0502.05.2003
til
Le Chaud Lapin wrote:
>[nobody wrote:]

> > - situations where modular arithmetic is acceptable and/or desirable,
> > but again this almost always is where <, <=, >=, and > do not figure
> > in, for example in this code fragment posted by John Potter:
> >
> > % for (vector<T>::size_type i = v.size() - 1; i != -1; -- i)
> > % process(v[i]);
> > %
> > % Still works. Whatever unsigned type is used for size_type, -1 is
> > % 0 - 1 and everything is beautiful.
>
> This is hardly beautiful. And unless my memory fails me, on a
> sign-magnitude machine, neither is it correct.

Beauty is in the eye of the beholder, but correctness follows from
the rules in the standard.

> On 2's-complement machine, -1 has bit pattern 0xFFFFFFFF (for example)

That's true.

> On sign-magnitude machine, -1 has 0x80000001

That's also true.

> So, on sign-magnitude machine, reduction of the unsigned int 0U by 1
> yields bit-pattern 0xFFFFFFFF, which is not equal to 0x80000001.

No. Irrespective of what representation is used, the standard requires
that the integer value -1, when converted to unsigned int by an integral
promotion, turn into 2^n - 1, where n is the number of bits in an
unsigned int. For n=32 that's 0xFFFFFFFF.

The citations from the standard to back this up are:

3.9.1 Fundamental types

4 Unsigned integers, declared unsigned, shall obey the laws of
arithmetic modulo 2^n where n is the number of bits in the value
representation of that particular size of integer

and:

4.7 Integral conversions

2 If the destination type is unsigned, the resulting value is the least
unsigned integer congruent to the source integer (modulo 2^n where n
is the number of bits used to represent the unsigned type). [ Note:
In a two's complement representation, this conversion is conceptual
and there is no change in the bit pattern (if there is no truncation). ]

Note that a signed-to-unsigned conversion DOES entail a change to the bit
pattern in a sign-magnitude representation when the source integer is
negative.
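
For what it's worth, here is a minimal check of that rule (my own
fragment, not taken from the standard): converting -1 to unsigned int
must give 2^n - 1, i.e. UINT_MAX, on every conforming implementation,
whatever the hardware does with negative numbers.

    #include <cassert>
    #include <climits>

    int main()
    {
        unsigned int u = -1;      // integral conversion: value becomes UINT_MAX
        unsigned int v = 0u;
        --v;                      // modulo arithmetic: also UINT_MAX
        assert(u == UINT_MAX && v == UINT_MAX);
    }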

nobody

James Kanze

ulæst,
2. maj 2003, 09.29.5502.05.2003
til
unorigina...@yahoo.com (Le Chaud Lapin) wrote in message
news:<fc2e0ade.03050...@posting.google.com>...

> nobody <clcppm...@this.is.invalid> wrote in message
> news:<200304301806...@localhost.localdomain>...

> > - situations where modular arithmetic is acceptable and/or
> > desirable, but again this almost always is where <, <=, >=, and > do
> > not figure in, for example in this code fragment posted by John
> > Potter:

> > % for (vector<T>::size_type i = v.size() - 1; i != -1; -- i)
> > % process(v[i]);

> > % Still works. Whatever unsigned type is used for size_type, -1 is %
> > 0 - 1 and everything is beautiful.

> This is hardly beautiful. And unless my memory fails me, on a
> sign-magnitude machine, neither is it correct.

The standard guarantees that it will work. Conversion from signed to
unsigned involves modulo arithmetic.

> On 2's-complement machine, -1 has bit pattern 0xFFFFFFFF (for example)

> On sign-magnitude machine, -1 has 0x80000001

True, but irrelevant.

> So, on sign-magnitude machine, reduction of the unsigned int 0U by 1
> yields bit-pattern 0xFFFFFFFF, which is not equal to 0x80000001.

The standard fully defines the conversions from signed to unsigned (but
not the other way round) in terms of modulo arithmetic. It is the
compiler's problem to make it work on the given hardware. On 2's
complement machines, making it work is trivial. On 1's complement and
signed magnitude, generally, the compiler has to generate a little bit
of extra code to get the right results. Whatever the machine, however,
the results of converting a negative number n to an unsigned type T are
defined as the mathematical results of taking
std::numeric_limits<T>::max() + 1, and then adding that value to n as
many times as necessary for the results to be in range for T.
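
Spelled out as ordinary arithmetic, that rule looks roughly like this
(the function name is mine, and the code assumes a compiler that
provides long long and an unsigned type narrower than it, e.g. 32 bits):

    #include <cassert>
    #include <limits>

    unsigned to_unsigned(long long n)
    {
        // max() + 1 is the modulus; add it until n is in range for unsigned.
        const long long modulus =
            static_cast<long long>(std::numeric_limits<unsigned>::max()) + 1;
        while (n < 0)
            n += modulus;
        return static_cast<unsigned>(n % modulus);
    }

    int main()
    {
        assert(to_unsigned(-1) == std::numeric_limits<unsigned>::max());
    }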

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr

Conseils en informatique oriente objet/
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, Tl. : +33 (0)1 30 23 45 16

Gavin Deane

ulæst,
2. maj 2003, 11.44.5302.05.2003
til

Of course yes.

If I am a car manufacturer and I can build robust cars or unrobust
cars, or I have the opportunity to get the new Super-X shock absorbers
that I want into the design, and the incremental cost of making robust
ones is that negligible, then of course I will build robust ones. With
the best quality control in the world, some batteries will be of
inferior quality and will leak a bit. As it gets older, any battery
will be more likely to leak acid onto other components. Why would I
not want my cars to be resilient to that? [*] It's a completely
different sort of maintenance, but the analogy fits with software
maintenance. Robust software components that don't assume they are
surrounded by perfection will make the development of the next
generation of the product much easier and less error prone.

GJD

[*] There is a reason of course. If I make cars that only last five
years, you have to buy another one from me more often, but that is
beside the point of my argument. And way off the topic of good
engineering practice.

Dave Harris

ulæst,
2. maj 2003, 11.58.1902.05.2003
til
jpo...@falcon.lhup.edu (John Potter) wrote (abridged):
> Oops, I just learned that subscripts should use size_type.

>
> for (vector<T>::size_type i = v.size() - 1; i != -1; -- i)
> process(v[i]);

>
> Still works. Whatever unsigned type is used for size_type, -1 is
> 0 - 1 and everything is beautiful.

Really? Suppose size_type is unsigned 16-bit and int is signed 32-bit in
the usual 2's-complement representation. Then as I understand it,
size_type(-1) will be promoted to 65535, which is not equal either to -1
or to unsigned(-1) == 4294967295.
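
A small illustration of that trap, assuming a hypothetical platform
where size_type is a 16-bit unsigned short and int is 32 bits:

    #include <iostream>

    int main()
    {
        unsigned short i = static_cast<unsigned short>(-1);  // 65535, i.e. size_type(-1)
        // Integral promotion turns i into the int 65535 before the comparison,
        // so the loop guard "i != -1" stays true and the loop never terminates.
        std::cout << std::boolalpha << (i != -1) << '\n';     // prints true
    }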

Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
bran...@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."

Thomas Richter

ulæst,
2. maj 2003, 12.07.1902.05.2003
til
Hi,

> I'm not going to open a good wine bottle just because I
> found a compiler that informs me I'm assigning a negative
> constant value to an unsigned variable.
> In my experience that is surely NOT the major source of
> problems of unsigned/signed conversions.

True enough. These mistakes are obvious.

> I think the reason is another... the reason is IMO
> that in the C++ language unsigned integers are NOT
> integers that can't hold negatives, but "modulo" integers.

Now, that is true for "signed int" as well. If I set a
variable "int x" to the largest representable integer and add
one, I get a negative number. This is clearly not true for
the group (Z,+) (integer numbers with addition).

> The problem start when one insists on using them as
> non-negative integers... and somewhat the standard
> library did this in a few places for no good reason.

I think using them for non-negative integers is still
a sensible thing to do, provided you keep in mind that
you're missing negatives. (non-negatives mod N still
form a group, though the inverses are less intuitive).

This causes a number of caveats for sure, but I still think
functions like "strlen" should return something unsigned
because it is very natural. There is no string whose length
is negative. In less obvious cases, you're asking for
trouble then as soon as you need to convert it to something
signed (because the range is smaller). Well, this is
overly picky and rarely a problem, of course.

> I think that a warning level informing of all implicit
> conversions from signed to unsigned would be completely
> useless.

I don't agree here. If I write a cast to the appropriate
signed type explicitly, then that's a note to the compiler
as "Yes Sir, I thought about what I'm doing here, it's
really fine", whereas otherwise I could have just overlooked
what's going on.

> My impression is that with current (mis)use
> of unsigned types by the standard library a C++ program
> with all explict casts and "U" after literals wouldn't
> be better under any aspect.

As said, I won't agree that this is a misuse. It is just
natural for some functions and appeals intuition. The
question is: When do we enter problems and a possible
"surprise factor". We do in two cases:
i) When blindly (=implicitly) casting to a signed
type, but the unsigned type is out of range for the
signed type.
ii) When taking differences containing unsigned types
resulting in unsigned's, expecting results as if we had (Z,+).

For i) a compiler warning for an implicit conversion
won't cover all cases, but at least some. In case
a library function returns a result that is too
large for a signed type, what can one do? A string
*could* be longer, in theory, than the size of a signed
int in a 32 bit address space (though highly unlikely... ;-)

Hence, the library does, indeed, return proper results in
all valid input cases, and would have had a problem for
signed in extremely rare situations, maybe.

For ii), a warning ("Difference could be signed") would be
helpful as well, but then simple constructions like "u--"
could result in warnings where obviously nothing is wrong.

> >What you illustrate is therefore not a reason to avoid
> >unsigned (as you argue), but to avoid incorrect signed/
> >unsigned conversion -- which depends highly on the context,
> >and which is an issue with the e.g. standard library, which
> >I for one would not argue that one should avoid (in general).

> Unsigned numbers are lovely when I need the property
> that characterize them... i.e. that they behave as
> elements of Z_{2^N}.

So do signed ints. After all, we only have a finite number
of bits. (-; Though I'm aware what you're saying, sure.

I'm not willing to say that "unsigned should be banned".
It should be used carefully, especially by the beginner, because
of exactly this possible "surprise factor". On the other hand,
unsigned's have something to offer as well, namely the
reminder to the programmer that some value is/should be
definitely non-negative - which can be a helpful hint. Instead of
seeing the world black and white, I would rather make
this dependent on context. In some cases, this additional
value is not worth the price ("surprise factor"). In others,
it is. It depends of course on the programmer as well. I feel
perfectly happy with unsigned's, and I'm aware of the
limitations.

> They also have another property,
> that is a range extended by just a bit, but this is,
> excluding very specific cases, not worth the hassle
> in my opinion.

Rarely. That's also not the point.

> IMO using them as parameters so you don't have to do
> checks for negative values is nonsense. IMO using them
> as return values that are integers that just happens
> never being negative is nonsense.

On that I don't agree. It is sometimes very useful to know
that a result is non-negative.

> IMO using them to
> gain the extra range is not worth the cost in bugs
> this will imply in a language like C++.

On that I agree. It happens rarely that one needs the
extra bits. (unsigned char being one exception as a natural
representation for ISO-LATIN characters)

> >>I also think that using -1 to set up an unsigned int
> >>is a perfectly valid use when you need that unsigned
> >>value
> >
> >It is. So what?

> So the warning could be annoying!

Well, I don't think so. A typical example where it is
common to assign -1 to an unsigned is to set up a bit mask
with all bits set. If I need a 0xfff... mask as
unsigned, I shouldn't write -1 for it, though the result
is really the same. I should rather write ~0, which
represents the idea better: Flip all bits of the zero.

In that case, a warning would be really annoying, though
not in the -1 case.
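
A tiny fragment showing the two spellings (both are well-defined and
give the same value; only the stated intent differs):

    #include <cassert>

    int main()
    {
        unsigned int from_minus_one = -1;    // converts to 2^n - 1: all bits set
        unsigned int from_not_zero  = ~0u;   // "flip all bits of the zero": same value
        assert(from_minus_one == from_not_zero);
    }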

> Actually I think that
> the percentage of programs in which it would be helpful
> is smaller than the percentage in which it would just
> be a false positive (i.e. the programmer indeed wanted
> to assign that number).

This is very hard to argue about. It is a matter of
the programming style of the user. After all, a warning
is only a warning. If you always know what you're doing,
turn it off. Problem solved. I would occasionally be
happy with it.

So (unsigned) long,
Thomas

P.S. I rather happen to notice that I signed this post.
Hence, it is signed, not unsigned.

P.P.S Sorry for the pun. Couldn't stop me.

Dave Harris

ulæst,
2. maj 2003, 12.11.0002.05.2003
til
unorigina...@yahoo.com (Le Chaud Lapin) wrote (abridged):

> Duct tape is not the answer. Getting it right is the answer.

I agree with you on this. I disagree about what "right" is. I don't think
it is right to use unsigned int as if it were a Pascal-like subrange of
int, because it doesn't have the correct semantics for that. It might be
nice if C++ had real subranges, but it doesn't.

(I don't agree with the person who said that C++ got this wrong; C++ is
just missing a potential feature which Pascal has. C++ is only actively
wrong if you try to force unsigned int into the Pascal subrange role.)

Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
bran...@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."


Alf P. Steinbach

ulæst,
2. maj 2003, 20.55.4502.05.2003
til
On 2 May 2003 09:15:36 -0400, Andrea Griffini <agr...@tin.it> wrote:
>...

>The problems start when one insists on using them as
>non-negative integers... and somehow the standard
>library did this in a few places for no good reason.

I'd rather like to think there are good reasons for most
everything in the standard library.

>...


>Like I've been trying to say in my bad english sure
>you can get bad behaviour of signed integers too.
>You even get undefined behaviour... but that happens
>FAR from normal uses,

IMO that's a very false sense of security. For a typical C++
compiler modulo arithmetic is used in both cases. Pretending
that it isn't, on the grounds that it _often_ works, makes for
very hard to find bugs in the cases where it doesn't work.


>...
>IMO using [unsigned] as parameters so you don't have to do


>checks for negative values is nonsense.

This has already been discussed, but in your answer to that
all but the initial introductory comments were elided, which
together with the above leads me to think that a recap might
be useful.

Let's try to do the dance again then.

Given a function to design: first choose a contract for the
function, then choose types that supports the contract. To
make that concrete, say the function is the one exemplified
earlier, 'alloc', with one 'size' argument. Here is a possible
contract extract, about the range of the 'size' argument:


A) 'size' must be in the range 0 through INT_MAX.


In that case 'int' supports range-checking, but conveys a
false impression that negative values are allowed. 'unsigned'
likewise supports range-checking (you can catch the exact same
client code bugs as with 'int'), and is a far more honest
about the allowed range. 'long' conveys nothing about the
allowed range but supports range-checking. 'unsigned long' or
'size_t' conveys the non-negative requirement and supports
range-checking, but may give the false impression of a much
higher upper limit than the actual one. A user-defined type
may make the range explicit but at the cost of efficiency and
introducing extra file relationships.

In summary, 'int' and 'unsigned' have in this case the same
technical advantage (range-checking), but 'int' is more of a
lie than 'unsigned' and so will most probably invite more client
code bugs, as well as require more reading of documentation --
I happen to think that what can be expressed in code instead of
in comments and/or documentation should be expressed in code.
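
For concreteness, here is a minimal sketch of contract A with an
unsigned parameter (the body and the choice of exception are mine,
purely illustrative): a caller's accidental -1 arrives as a value
above INT_MAX and is rejected by the same kind of check that a signed
parameter would need.

    #include <climits>
    #include <cstdlib>
    #include <stdexcept>

    // Hypothetical 'alloc' under contract A: size must be in 0 .. INT_MAX.
    void* alloc(unsigned size)
    {
        if (size > static_cast<unsigned>(INT_MAX))     // a caller's -1 shows up here
            throw std::length_error("alloc: size out of range");
        return std::malloc(size);
    }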

Here is another possibility for the range-part of the contract:


B) 'size' must be in the range 0 through UINT_MAX.


In this case, 'int' is both very misleading and not universally
portable (although it can be made to work with most compilers),
and doesn't support range-checking. 'unsigned', on the other hand,
is in this case an exact match for the contract, but it doesn't
support range-checking. 'long' is a lie, and so an invitation to
bugs, but it does support range-checking. 'unsigned long' is less
of a lie, while still supporting range-checking.

If the rest of the contract doesn't require range-checking, then
it is legitimate to consider 'int' and 'unsigned'; choosing
between them is, to me, a no-brainer, given the pros and cons
listed above.

If, on the other hand, the rest of the contract does require range-
checking, then choosing between 'long' and 'unsigned long' is
likewise a no-brainer to me.

Why do you choose whatever you choose for case B?

Which alternative 'alloc' contract C do you have in mind where
'int' is for some reason a better choice than 'unsigned', or
'long' is for some reason a better choice than 'unsigned long',
and most importantly, why would signed be better for contract C?


- Alf

nobody

ulæst,
3. maj 2003, 06.12.5303.05.2003
til
Dave Harris wrote:
> unorigina...@yahoo.com (Le Chaud Lapin) wrote (abridged):
> > Duct tape is not the answer. Getting it right is the answer.
>
> I agree with you on this. I disagree about what "right" is. I don't think
> it is right to use unsigned int as if it were a Pascal-like subrange of
> int, because it doesn't have the correct semantics for that. It might be
> nice if C++ had real subranges, but it doesn't.

I fully agree with this. To paraphrase from the Boost documentation,
the biggest problem with unsigned is that programmers use it as if
it behaved the way they would like it to (i.e., as a Pascal-like
subrange of int) instead of the way it actually does.

> (I don't agree with the person who said that C++ got this wrong; C++ is
> just missing a potential feature which Pascal has. C++ is only actively
> wrong if you try to force unsigned int into the Pascal subrange role.)

The reason I said that is because the C/C++ integral promotion rules
make it all too easy to abuse unsigned int as if it were a Pascal-like
integer subrange. Excessively liberal implicit conversions are my
main complaint, not the presence of unsigned int per se.

nobody

Pierre Baillargeon

ulæst,
3. maj 2003, 06.18.1303.05.2003
til
I'd like to give you some debating advice. Once you've been through a
few meetings and debates, you learn to have automatic alarms about
some suspect behaviors. Here are a few points that will help me take
your arguments more seriously:

- Stop using loaded words to describe sides. Stop labeling positions
as "virtuous" "pure", "intrepid", "belligerent", "defective design",
"insidious degradation", "evil juice", "Good engineers", "Duct tape",
etc. (and that is only in your last post!).

- Stop insulting people who disagree with you. You may think that your
insults are subtle and will therefore slip by, but it just degrades
your argument. For example, in your last post, you label those who
disagree with you as: not "good engineers", "those who [did not]
study computer science", disobeying one's boss's orders, and suggesting
that it's people like the pro-signed that caused catastrophes.

- Using loaded examples and incorrect metaphors. An example of the
latter is your appeal to big catastrophes. Unfortunately for your
argument, those catastrophes were not caused by errors due to bad
interfacing but by catastrophic failures or usages beyond limits. If
anything, they argue for resilient systems with graceful degradations,
not design in isolation vs. design as a whole.

- Following an argument, but suddenly reaching an opposite conclusion.
For example, you talked about catastrophes caused by improper
validation of interfaces between two modules. But using unsigned
integers *increases* the number of interface points, thus the
likelihood of having one wrong. Yet you use this as proof that using
unsigned int is better!

unorigina...@yahoo.com (Le Chaud Lapin) wrote in message news:<fc2e0ade.03050...@posting.google.com>...
>

> No, you identify those interfaces where cross-over is about to occur,
> and deal with the disparity in type in a thoughtful, deliberate
> manner. I think those who studied computer science call this
> methodology "enforcing the abstraction barrier."

Nobody is arguing about overlooking "interface cross-over" points. The
point was that reducing the number of such interfaces reduces the
likelihood of errors, and reduces the code size and complexity at the
same time. Those gains are directly derived from the reduced number of
execution paths since there are fewer check points. BTW, we're
thoughtful, deliberate and have made studies as well.

> Perhaps the principle at stake in this discussion is:
>
> "What constitutes virtuous systems design?"

[...]

Those paragraphs add nothing to the debate. They are just labeling
sides as virtuous, pure, permissive to defective design (*scoff*) etc.
Which have no objective value. There is no such thing as an absolute
measure of purity or virtuousness, otherwise we would not have this
debate.

For example, what you call purity is for me just buying a little bit
of range for the valid values a variable can take. An unsigned int can
represent values from 0 to UINT_MAX, while a signed int can represent
values from 0 to INT_MAX. In both cases, there are values that will
yield an incorrect representation in the variable. Using unsigned only
doubles the range, which is, in my view, a very small gain for the
cost. If the range gain were of the order of 1000 or more, I would be
more inclined to consider the cost outweighed. For example, I see
right away that using 32-bit ints is much more useful than 16-bit
ints.

>
> Maybe if we were not talking about software engineering but some other
> engineering, say mechanical,

[loaded example deleted]

[cost rhetorical questions deleted]

It would be more useful if you spelt out what the costs in question
are and how using unsigned int eliminates them.

[how good engineers go from purity to virtuousness deleted]

[remark about using signed int is being belligerent deleted]

[how using signed ints is belligerent behavior which leads to
non-virtuousness deleted]

Again, what if you spelt out the cost, benefits or any other
measurable or objective measure instead of using the undefined and
empty "virtuous" label?

Also, how does using signed int make a system belligerent, chaotic,
uncertain, defective? These are all thrown about without advancing a
single supportive claim.

Le Chaud Lapin

ulæst,
3. maj 2003, 06.19.4603.05.2003
til
nobody <clcppm...@this.is.invalid> wrote in message news:<200305020334...@localhost.localdomain>...

I do not understand. Assuming a 32-bit sign-magnitude environment,

What is the bit pattern for -1? Is it (for example), 0xFFFFFFFF or 0x80000001?

What is the bit pattern of 's' after executing the following code?

signed int r = -1;
signed int s = r; // What is the bit pattern for s?

What is the bit pattern of 'u' after executing the following code?

unsigned int u = 0U;
--u; // What is the new bit pattern of u?

-Chaud Lapin-

LLeweLLyn

ulæst,
3. maj 2003, 06.23.2303.05.2003
til
Andrea Griffini <agr...@tin.it> writes:
[snip]

> I think that a warning level informing of all implicit
> conversions from signed to unsigned would be completely
> useless. My impression is that with current (mis)use
> of unsigned types by the standard library a C++ program
> with all explict casts and "U" after literals wouldn't
> be better under any aspect.
[snip]

gcc has -Wsign-compare, which warns on comparison between signed and
unsigned types. I've seen two 100KLOC size projects attempt to
enable this flag for all compiles, and abandon the attempt due to
excess warnings. Note, this flag doesn't warn for all conversions
- just those that might change the result of a comparison.

I think it is potentially useful for projects that start out using it,
but not for adding to most pre-existing code bases.
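
For reference, a fragment of the kind that flag complains about
(compiled with e.g. g++ -Wsign-compare); the int is converted to the
unsigned size type before the comparison, so a negative i would
compare as a huge value:

    #include <vector>

    bool index_is_small_enough(const std::vector<int>& v, int i)
    {
        return i < v.size();   // warning: comparison between signed and unsigned
    }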

Vladimir Kouznetsov

ulæst,
3. maj 2003, 06.31.1503.05.2003
til
> So, as far as I am concerned, if something must be non-negative, make
> it unsigned. Container sizes, screen coordinates, person's age or
> height, etc.

For the record - I don't agree with that pro-signed reasoning. However
it's quite convenient to use int for screen coordinates and I can even
imagine applications where a person's age and container size could be
negative! Well, normally they are not...

thanks,
v

Andrea Griffini

ulæst,
3. maj 2003, 06.31.4303.05.2003
til
On 2 May 2003 12:07:19 -0400, Thomas Richter
<th...@cleopatra.math.tu-berlin.de> wrote:

>Now, that is true for "signed int" as well. If I define
>variable "int x" the largest representable integer and add
>one, I get a negative number.

No. You formally get undefined behaviour. Ok... probably
95% of environments out there just give you a negative
number; but that's not guaranteed.

>This causes a number of caveats for sure, but I still think
>functions like "strlen" should return something unsigned
>because it is very natural.

It's not natural because unsigned are not integers...
Maybe you find it natural that computing

double x = strlen(s) - 1;

gives you a huge positive number with an empty string ?
Do you find it natural that if I do

for (int i=0; i<strlen(s)-1; i++)
...

then the loop is going to take forever and access outside
the string if the string is empty ?

The problem is that if they are to be considered "integers
that can't be negative", then it's nonsense to have
the difference of two unsigned values be unsigned, or the
negation of an unsigned be unsigned, or to have any
unsigned int compare as less than or equal to -1.
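
A minimal demonstration of the first surprise, together with the
usual repair of doing the subtraction in a signed type first (the
repair is mine, not from the post; the exact huge value depends on the
width of size_t):

    #include <cstring>
    #include <iostream>

    int main()
    {
        const char* s = "";                              // empty string
        double x = std::strlen(s) - 1;                   // size_t(0) - 1 wraps *before*
        std::cout << x << '\n';                          // the conversion: huge positive number
        long n = static_cast<long>(std::strlen(s)) - 1;  // signed arithmetic first: -1
        std::cout << n << '\n';
    }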

>There is no string whose length is negative.

So ? Still I don't think it's obvious or even useful
to think of the length of a string as a member of
Z_{2^N} (because THAT is what unsigned ints are in C++).
Maybe there are cases in which this is what I want, but
IMO those are the singularities, not the general case.

>In less obvious cases, you're asking for
>trouble then as soon as you need to convert it to something
>signed (because the range is smaller). Well, this is
>overly picky and rarely a problem, of course.

Unsigned ints normally provide a more uniform access
to all the values. There are cases in which this is
just perfect. Once again I don't think this is the
common case; like I said before my opinion is that if
2^(n-1) isn't enough for you then normally quite soon
2^n won't be enough either and you should look for a
better solution. It's just double the range, after all; that's
not going to change your life.

>> I think that a warning level informing of all implicit
>> conversions from signed to unsigned would be completely
>> useless.
>
>I don't agree here. If I write a cast to the apropriate
>signed type explicitly, then that's a note to the compiler
>as "Yes Sir, I thought about what I'm doing here, it's
>really fine", whereas otherwise I could have just overlooked
>what's going on.

I think that having to cast all indices used with std::vector
elements could be quite annoying. Especially if we consider
that the type cast has been made ugly on purpose (according
to the literature this is a feature... but I don't agree
on this; it's like saying that it would be better to have
asm directives requiring just binary values instead of
mnemonics, because you shouldn't use them a lot anyway).

>> My impression is that with current (mis)use
>> of unsigned types by the standard library a C++ program
>> with all explict casts and "U" after literals wouldn't
>> be better under any aspect.
>
>As said, I won't agree that this is a misuse.
>It is just natural for some functions and appeals intuition.

IMO this happens because you're fooled by the name "unsigned"
that doesn't reflect the way those entities behave in C++.
Try calling them "members of Z_{2^N}" and think how intuitive
their use in the standard library is.

>The question is: When do we enter problems and a possible
>"surprise factor". We do in two cases:
>i) When blindly (=implicitly) casting to a signed
>type, but the unsigned type is out of range for the
>signed type.

That's undefined behaviour. Probably your programs should
try to stay away from those limits.

>ii) When taking differences containing unsigned types
>resulting in unsigned's, expecting results as if we had (Z,+).

What do you think is more natural as the result of
taking the length of an empty string and subtracting one ?
minus one or four billion and something ?

>For i) a compiler warning for an implicit conversion
>won't cover all cases, but at least some. In case
>a library function contains a result that is too
>large for a signed type, what can one do? A string
>*could* be longer, in theory, than the size of a signed
>int in a 32 bit address space (though highly unlikely... ;-)

Exactly. And if you're in a 16 bit environment that would
be unlikely too; often that is just half of your addressable
space, and having more than half of it containing just one
string doesn't sound to me that common.
When it's not half of your addressable space (say in a paged
environment) then still, if you happen to need to handle
strings of more than 32K, I wouldn't be surprised if
64K turned out to be an artificial limit too. IMO this indeed means you
need a better data structure, not just a single bit more.

>Hence, the library does, indeed, return proper results in
>all valid input cases, and would have had a problem for
>signed in extremly rare situations, maybe.

I think it doesn't return the proper result. The numeric
value may be ok, but the type is wrong. The size of a
vector hardly has anything to do with Z_{2^N} elements.

>I'm not willing to say that "unsigned should be banned".

Neither do I. I think they're an important part of C++.
Not having them for bit fiddling would be a major loss,
and this would mean opening a space between asm and C++.
Not leaving such a space I think was one of the results
C++ was aiming to.

>On the other hand, unsigned's have something to offer
>as well, namely the reminder to the programmer that
>some value is/should be definitely non-negative - which
>can be a helpful hint.

I don't see that as really important. But maybe it's me.
I even have doubts about const correctness in this respect.

>Depends of course on the programmer as well. I feel
>perfectly happy with unsigned's, and I'm aware of the
>limitations.

I've been hit a couple of times only, but it hurts.
And I've found myself explaining to other programmers
why their code was wrong on quite a few occasions and,
believe me, the C++ position is hardly defensible.
Documenting a bug doesn't change it to a feature; if
most programmers get this wrong then IMO it's the
language that is broken.

Like I said before, this is of course assuming that
you're writing programs for a reason, and that the
working program is what you're seeking.
If your needs are just blah blah talking about what
is formally defined by the language then it's all
different; my impression is that this thread was not
about that kind of talking, but about practical
implications.

>(unsigned char being one exception as a natural
>representation for ISO-LATIN characters)

Unsigned chars are indeed an exception, because they
actually (normally) BEHAVE like integers. The reason
is because of how those strange conversion rules work
and because char is just a storage type like float;
this implies that on most platforms

unsigned char x = 2;
std::cout << x - 3;

outputs -1. But if you replace the "char" keyword with
"int" then the output will be a huge strange number.
Try also this

unsigned char x = 4;
std::cout << -x;

and see what the output is.

My guess is that having the rules forcing the unsigned-ness
of the result even in these cases would break *a lot* of
code out there.

>P.P.S Sorry for the pun. Couldn't stop me.

:)

Andrea

Dave Harris

ulæst,
3. maj 2003, 06.32.0303.05.2003
til
th...@cleopatra.math.tu-berlin.de (Thomas Richter) wrote (abridged):

> > I think the reason is another... the reason is IMO
> > that in the C++ language unsigned integers are NOT
> > integers that can't hold negatives, but "modulo" integers.
>
> Now, that is true for "signed int" as well. If I define
> variable "int x" the largest representable integer and add
> one, I get a negative number.

Actually you get undefined behaviour. It might be a negative number on
your current platform, but the next compiler upgrade might turn it into an
integer overflow exception. And I hope it does - your program has a bug in
it and the exception could make it easier to find. Some CPUs can detect
the overflow in hardware, so it is effectively free.

Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
bran...@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."


Vladimir Kouznetsov

ulæst,
3. maj 2003, 06.33.2003.05.2003
til
> When was the last time you wrote a non-trivial piece of software
> straight off bug free? If you want to deal with the problem you need
> to find the problem. How do you expect to find it you never look for
> it? All engineers will make mistakes sometimes. What is wrong with
> looking for mistakes where they are easiest to detect, even if they
> need to be fixed elsewhere? Surely that's better than building a
> system on the assumption that everyone involved is perfect and then
> waiting to see how it breaks. Or whether it breaks at all, it may just
> silently do the wrong thing of course.

Does anybody want to argue with _that_? Not me! But how does that imply that
unsigned is evil? However entertaining the reading of this thread is, I
still don't get that.

Let's consider the favorite pro-signed guys' example again:
void* alloc_buffer(int size) // or some such

One of the reasons was: it's easy to misuse unsigned - 0 is next door...
Let's see: int x1 = 1, x2 = 2; alloc_buffer(x1 - x2): gives us -1. Is that
inside the "normal" range? It is! Is it inside acceptable range? No. Here we
are - next door. Physically acceptable range doesn't matter - of over 4
billions variations of input parameter for operator delete (assuming that
pointers are 32 bit) only few are valid.
Well, we can check that: if (size < 0) - Aha! Look how cute that is!
So what are we doing here? We are checking if the value is in a range. What
stops us from doing the same for an unsigned parameter?

That basically is a debugging check. The client code has a bug and we are
helping to find the bug. The next question is: what are we going to do if
size is negative? In a release build the best we can generally do is
abort() because the program's logic is compromised - everything else is
important for developers, not for the user. Well, maybe we could throw an
exception. How about std::bad_alloc? Which will likely happen with an
unsigned parameter anyway...

Now imagine that the function's name is seek, int size is 32 bit and there are
files that can be up to 4G, and you'll get what we have now: to seek to the
_other_ half we have to either seek() twice or seek() from the end backward or
from the start forward depending on the distance. What's the point of seeking
backward from the beginning, by the way? Don't forget to check for that!

So what are those overwhelming advantages of signed? Just don't tell me about
mixing types in expressions - I hate implicit conversions.

>
> GJD

thanks,
v

Le Chaud Lapin

ulæst,
3. maj 2003, 15.06.2503.05.2003
til
deane...@hotmail.com (Gavin Deane) wrote in message news:<6d8002d0.03050...@posting.google.com>...

> When was the last time you wrote a non-trivial piece of software
> straight off bug free?

Four days ago. I implemented retransmission of packets using events,
semaphores, mutexes, multithreading, Associative_Queue<int, Packet *>,
etc. As usual, I used unsigned int to represent the number of items
in my containers. Occasionally, I must call a Microsoft Windows
function that returns a polysemantic value that could be an error,
represented as a negative number, or a count, represented as a
positive number. The positive value might be forwarded to one of my
components. Rather than blindly passing the value generated by the
Microsoft Windows function to my components, I deal with the
possibility of error first, and if there is no error, I cast and do
the forwarding. I do have one nasty bug in it that I have not found,
but I am 100% certain it is not internal to any of my C++ classes.
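
A rough sketch of that pattern with stand-in names (the Windows call
and the component interface below are hypothetical, only to show where
the check and the cast go):

    #include <iostream>

    int some_windows_count_or_error() { return -3; }    // hypothetical polysemantic return
    void forward_count(unsigned n) { std::cout << "count " << n << '\n'; }  // hypothetical component

    int main()
    {
        int raw = some_windows_count_or_error();
        if (raw < 0)
            std::cerr << "error " << raw << '\n';        // handle the error at the boundary
        else
            forward_count(static_cast<unsigned>(raw));   // cast only after the check
    }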

> If you want to deal with the problem you need
> to find the problem. How do you expect to find it you never look for
> it?

I design my software with a zero-defect mentality. While this may
seem like a silly notion, it has the amazing effect of allowing me to
forget about very complex previously-designed modules that I, at
some point, considered to be "regular". The result is that I can
build extremely large systems as an individual, and when using them to
synthesize systems, defects practically beg to be discovered because
they have no place to hide. Yes, I do have exceptional cases, such as
code attempting to remove an item from an empty container. But in such
cases, I simply throw an exception. What makes this different from
throwing an exception when a negative value is passed to a function
whose argument is inherently non-negative is that throwing exceptions
on empty containers, in a disruptive sort of way, actually *increases*
the regularity of my system.

> All engineers will make mistakes sometimes. What is wrong with
> looking for mistakes where they are easiest to detect, even if they
> need to be fixed elsewhere?

Because it encourages bad engineering. There is nothing wrong with
engineering fault-tolerance. But fault-tolerance should be performed
at inter-component boundaries, not within the components, unless of
course, those components are subsystems themselves and capable of
contributing to recoverability. In other words, whenever you are
working with primitives, sometimes it's better to let them be "naked" and
"surrender to virtue", and then use those naked primitives to deliberately
engineer a fault-tolerant system. The statement you made gives the
impression that it is good engineering practice to employ even your
most fundamental, virtuous primitives as real-time poop alarms.

A good example is the string library. If you try to use logic to
accommodate all the things that can go wrong with negative values given to
strlen/strcat/strncpy within the functions themselves, you will go
bananas. Even though you may be able to detect exceptional conditions,
you will not have the foggiest idea what to do about them external to
their context. This is your first hint that you are violating the
regularity of a primitive component.

Another example is CMOS memory. It has been electrostatically
fragile for decades, yet manufacturers will only go so far in trying
to protect it. They would rather put a little sticker on it saying
"Warning: Electrostatic Discharge Will Destroy This Device"
Apparently, a warning label is currently the "sweet spot" for
engineering fault-tolerance in CMOS memories.

If a child drinks paint thinner, he will probably die, so when my
nephew comes over to play, I keep him out of my garage, because the
spouts are not child-proof. Apparently the sweet spot for protection
is no protection.

The kick-back from chainsaws can hurt real bad if you're not careful.
The first time you're cutting down a hardwood tree and you experience
severe kick-back, it's a heart-pounding experience, though no one
warns you about hardwood. The sweet spot is a label saying, "This
chainsaw is dangerous."

The point is that there is danger everywhere. The components are
*always* fragile in a treacherous context. Are 10 centimeter steel
bolts robust? Not when they swim in a bucket of nitric acid.
Airbus A300 sturdy? So sturdy that it just might survive roll-torque
caused by a nice thin ridge of ice on the wings at 10,000
meters. 5GHz Intel CPU fast? Yep. Watch how quickly it computes
wrong answers with 1 ppm sodium contamination.

So nothing is really fault-tolerant. Knowing where to draw the line
requires global perspective and local focus. In computer science,
global perspective tells us that the vast majority of integral
quantities represented by typical software programs are inherently
non-negative. Therefore, inter-primitive regularity would be best
served by integral interface parameters that are inherently
non-negative.

-Chaud Lapin-

Francis Glassborow

ulæst,
3. maj 2003, 15.08.2703.05.2003
til
In message <3eb29719....@News.CIS.DFN.DE>, Alf P. Steinbach
<al...@start.no> writes

>On 2 May 2003 09:15:36 -0400, Andrea Griffini <agr...@tin.it> wrote:
>>...
>>The problem start when one insists on using them as
>>non-negative integers... and somewhat the standard
>>library did this in a few places for no good reason.
>
>I'd rather like to think there are good reasons for most
>everything in the standard library.


Yes but sometimes the good reasons are not actually 'good enough'. If
you have any doubts look at the unfortunate specialisation of vector for
bool.
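
A small illustration of why that specialisation keeps being cited (my
example, not Francis's): std::vector<bool> packs its elements, so
operator[] hands back a proxy object rather than a real bool&.

#include <vector>

int main()
{
    std::vector<bool> flags(8);

    // bool& ref = flags[0];  // does not compile: operator[] yields a
                              // proxy object, not a reference to bool
    bool value = flags[0];    // reading through the proxy is fine
    flags[0] = !value;        // so is assigning through it

    return 0;
}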


--
ACCU Spring Conference 2003 April 2-5
The Conference you should not have missed
ACCU Spring Conference 2004 Late April
Francis Glassborow ACCU

Andrea Griffini

ulæst,
3. maj 2003, 15.19.5203.05.2003
til
On 2 May 2003 20:55:45 -0400, al...@start.no (Alf P. Steinbach) wrote:

>On 2 May 2003 09:15:36 -0400, Andrea Griffini <agr...@tin.it> wrote:
>>...
>>The problem start when one insists on using them as
>>non-negative integers... and somewhat the standard
>>library did this in a few places for no good reason.
>
>I'd rather like to think there are good reasons for most
>everything in the standard library.

Sure, I think that too. But I also think that both
the library and the language contain a few "errors".
I'm not saying that there are no reasons for these
errors to be there, and I'm not even saying that we
should remove them. For example I think that having
the non-virtual (i.e. wrong) dispatch *by default*
is a language error, but I don't think this should
be changed; IMO it's too late for that.
When I say "for no good reasons" I simply mean that
the final net effect a decision had was bad for the
language; so the reasoning behind that decision had
problems, may be overestimating the gain or
underestimating the cost of deciding that way.

Having justifications for these errors don't make
them disappear, however. Documenting a bug doesn't
transform it to a feature.

In a few places it's really hard for me to imagine what the reasons
could have been for what seems at first sight an absurd decision.
There are parts of the C++ language for which I can only think there
has been some strike of the "committee effect" (a kind of lightning
that can make any logic disappear), or that they can be classified as
"pee smell" and "I was here" signs.

>>Like I've been trying to say in my bad english sure
>>you can get bad behaviour of signed integers too.
>>You even get undefined behaviour... but that happens
>>FAR from normal uses,
>
>IMO that's a very false sense of security. For a typical C++
>compiler modulo arithmetic is used in both cases. Pretending
>that it isn't, on the grounds that it _often_ works, makes for
>very hard to find bugs in the cases where it doesn't work.

I've been hit by integer overflow bugs too, and when that
happened to me, most often using unsigned values instead
wouldn't have gained anything.
Ignoring that issue is surely dangerous, but the kind
of "surprising" behaviour unsigned ints show is in my
experience far more common and annoying.
Also, while avoiding integer overflow can be quite
costly if you try to do it simply by using a wider type,
avoiding the most common "unsigned int" problems is often
a zero-cost operation: just don't use unsigned, and use
ints instead.

>>IMO using [unsigned] as parameters so you don't have to do
>>checks for negative values is nonsense.
>
>This has already been discussed, but in your answer to that
>all but the initial introductory comments were elided, which
>together with the above leads me to think that a recap might
>be useful.

I think that a prototype alone is quite rarely enough to
let you use a function without making mistakes.
Also I think that putting every precondition in the
prototype is not heaven. Programs never reflect 100%
of a problem; the model is a simplification and reduction,
and what to put in and what to leave out is a decision made
by weighing the cost and benefit of the inclusion.
For unsigned parameters I don't feel really that bad,
because the cost is lower; I do think that using unsigned
so that you can *avoid checking* is nonsense, and
THAT is what I said. Also I think that in many cases those
checks are more meaningful and readable if signed
parameters are used.
Unsigned return values and variables, on the other hand,
have a serious cost in "surprise effect", and I happen to
think that the gain is rarely able to justify this cost.

Do you think this surprise effect doesn't exist? That
the cost is low? My opinion and personal experience are
different on this point, but I suppose there's nothing I
can say to change your mind on this (nor anything you
can say to change my mind; probably we have different
kinds of scars).

Are you advocating the removal of the "surprising"
behaviour from the language? I think it's far too late
for that. Unsigned ints are not "integers that can't
be negative", because they behave "strangely" if you
look at them in that light (that subtracting an integer
that can't be negative from an integer that can be
negative yields a result that can't be negative doesn't
sound logical to me at all), but if you don't get fooled
by the name then they can indeed be useful.
There are places in which they fit perfectly; those
places are just IMO *not* everywhere an integer that
can't have negative values appears.

>Let's try to do the dance again then.

Let's not. I hope I was able to show what my point is.

>A user-defined type
>may make the range explicit but at the cost of efficiency and
>introducing extra file relationships.

I think that if you're for expressing everything in the
prototype then that's the way to go. I think about
programming somewhat differently however.

>Which alternative 'alloc' contract C do you have in mind where
>'int' is for some reason a better choice than 'unsigned', or
>'long' is for some reason a better choice than 'unsigned long',
>and most important, why would signed be better for contract C?

I'll try to draw a parallel at a different level.
Suppose you have an accounting program that asks for
a customer code from the keyboard, and suppose that
you have another version in which the customer code
is chosen from a list.

Would the second program be error-free just because
it's impossible to enter an invalid customer code?
The answer depends on how you look at the operation
being carried out... the form doesn't need just ANY
customer code, it needs the correct customer code for
the transaction being processed, not another (valid)
customer code.
There are cases in which it is better to have an invalid
code than to have the valid code of the wrong customer.
The version that allows choosing from a list has some
advantage in speed of entering the code (if the
customers are not too many) and in avoiding some kinds
of typos, but has the drawback of transforming other
kinds of typos into a subtler form of error.

I think that using an unsigned number as a parameter
(like using parameters of type "char" or "short") has
the small advantage of documenting intent better, but
the defect of relying on subtle implicit conversions to
change values between when the user calls the function
and when the function receives the parameter.
I happen to think that the small gain is not worth
the defect... but of course that's just my opinion.

What made me jump out of my chair was someone saying
that if you use an unsigned parameter then no check
is needed, because the value is correct.
I think that with C++, with what unsigned means in
the language and considering how unsigned values are
handled in expressions, this is nonsense.

Andrea

nobody

ulæst,
3. maj 2003, 20.40.1603.05.2003
til
Le Chaud Lapin wrote, in reference to 3.9.1/4 and 4.7/2:

> I do not understand. Assuming a 32-bit sign-magnitude environment,
>
> What is the bit pattern for -1? Is it (for example), 0xFFFFFFFF or
> 0x80000001?

-1 is a signed integer literal, and the corresponding bit pattern
(when this value is stored in a signed int) is 0x80000001.

> What is the bit pattern of 's' after executing the following code?
>
> signed int r = -1;

The bit pattern for r is 0x80000001.

> signed int s = r; // What is the bit pattern for s?

The bit pattern for s is 0x80000001. No conversion is
required, as s and r have the same types.

> What is the bit pattern of 'u' after executing the following code?
>
> unsigned int u = 0U;
> --u; // What is the new bit pattern of u?

0xFFFFFFFF ... because of the "modular arithmetic" rule in 3.9.1/4.

Notice that we'd get the same thing from any of the following three
statements:

u = -1U; // no conversion, bit pattern for -1U is 0xFFFFFFFF per 3.9.1/4
u = -1; // involves int -> unsigned conversion, see 4.7/2
u = s; // ditto, assuming r = -1 ; s = r as above, per 4.7/2
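
A small demonstration of the rule under discussion (an illustrative
sketch of mine, not part of the original post; it assumes 32-bit int
and unsigned int):

#include <cassert>
#include <iostream>

int main()
{
    signed int   r = -1;
    unsigned int u = 0U;
    --u;                     // wraps to 2^32 - 1 per 3.9.1/4

    // Both of the following hold no matter how the hardware stores -1:
    // the conversion is defined on values, not on bit patterns.
    assert(u == static_cast<unsigned int>(r));
    assert(u == 0xFFFFFFFFU);

    std::cout << u << "\n";  // prints 4294967295
    return 0;
}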

nobody

John Potter

ulæst,
3. maj 2003, 20.43.0403.05.2003
til
On 3 May 2003 06:19:46 -0400, unorigina...@yahoo.com (Le Chaud
Lapin) wrote:

> > Note that a signed-to-unsigned conversion DOES entail a change to the bit
> > pattern in a sign-magnitude representation when the source integer is
> > negative.

> I do not understand.

There is no need to understand. In a 32-bit environment, the C++ (and C)
standard mandates that unsigned(-1) is 2^32 - 1. The compiler must
generate the code to make it happen regardless of the hardware.

> Assuming a 32-bit sign-magnitude environment,

> What is the bit pattern for -1? Is it (for example), 0xFFFFFFFF or 0x80000001?

The latter; however, unsigned(-1) is the former.

> What is the bit pattern of 's' after executing the following code?

> signed int r = -1;
> signed int s = r; // What is the bit pattern for s?

No content to question.

> What is the bit pattern of 'u' after executing the following code?

> unsigned int u = 0U;
> --u; // What is the new bit pattern of u?

All 1s.

u = r; // Also all 1s.

Stop trying to use common sense, this is law. The standard mandates
unsigned(-1) is 2^N-1 regardless of the hardware. Compiler writers
must get it right.

John

Thomas Richter

ulæst,
4. maj 2003, 05.56.2104.05.2003
til
Andrea Griffini wrote:

> >Now, that is true for "signed int" as well. If I define
> >variable "int x" the largest representable integer and add
> >one, I get a negative number.
>
> No. You formally get undefined behaviour. Ok... probably
> 95% of environments out there just give you a negative
> number; but that's not guaranteed.

Thanks, my mistake. You're right.

> >This causes a number of caveats for sure, but I still think
> >functions like "strlen" should return something unsigned
> >because it is very natural.
>
> It's not natural because unsigned are not integers...
> May be you find natural that computing
>
> double x = strlen(s) - 1;
>
> gives you a huge positive number with an empty string ?

Yes, because you're making a mistake here. You perform subtraction
within the group (Z_{2^n},+), but this is not the kind of operation
you need. There's also an implicit conversion in the above line that
should be reported: "1" is an int, and "strlen" returns size_t. The
error would be more obvious if you had to write

strlen(s) - 1U

to avoid the warning: here we have a subtraction of two unsigned
values, and you should at least be worried about why you had to write
1U instead of 1 to get no warning.


> Do you find natural that if I do
>
> for (int i=0; i<strlen(s)-1; i++)
> ...
>
> then the loop is going to take forever and access outside
> the string if the string is empty ?

I don't think this loop makes much sense in the first place (why don't
you want to handle the last character in the loop? Note that strlen(x)
returns the number of characters in the string *excluding* the
terminating NUL). If I want to do this, then I *should* check for the
special zero-length case, since it is indeed special. Besides, I would
expect a compiler to warn me here (subtraction of a signed value from
an unsigned one, asking for trouble).
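
A small sketch of the pitfall and one way around it (illustrative
only, not from the original post; the function name is made up):

#include <cstddef>
#include <cstring>

void process_all_but_last(const char* s)
{
    // Problematic: with an empty string, strlen(s) - 1 wraps to
    // SIZE_MAX because strlen returns the unsigned type size_t, and
    // the signed i is converted to unsigned for the comparison, so
    // the loop runs far past the buffer:
    //
    //     for (int i = 0; i < strlen(s) - 1; i++) ...
    //
    // Safer: keep the index unsigned and avoid the subtraction.
    std::size_t len = std::strlen(s);
    for (std::size_t i = 0; i + 1 < len; ++i) {
        // handle s[i]; the last character is skipped, as intended
    }
}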

> The problem is that for them being consider "integers
> that can't be negative" then it's nonsense to have
> the difference of two unsigned being unsigned

Well, not really, as this operation is indeed well defined in a
suitable group, though not what you're most likely asking for.
However, this is not what happens here.

> nor the
> opposite of an unsigned being an unsigned, nor it
> doesn't make sense that any unsigned int compare as
> less than or equal to -1.

Yes, and here a compiler should warn as well. GNU g++ emits a warning
if you compare signed with unsigned.

> >There is no string whose length is negative.
>
> So ?

Definitely. (-;

> Still I don't think it's obvious or even useful
> thinking to the lenght of a string as a member of
> Z_{2^N} (because THAT is what unsigned ints are in C++).

Are ints elements of Z? They aren't, either!

> May be there are cases in which this is what I want, but
> IMO those are the singularities, not the general case.

I think you're overly stressing the "wrap around" problem of unsigned
ints. I wouldn't say that this is "too often" a problem. It could be
a problem - it is a price you have to pay - but you get something back
in return. See my arguments below.

> >I don't agree here. If I write a cast to the apropriate
> >signed type explicitly, then that's a note to the compiler
> >as "Yes Sir, I thought about what I'm doing here, it's
> >really fine", whereas otherwise I could have just overlooked
> >what's going on.
>
> I think that having to cast all indicies to std::vector
> elements could be quite annoying.

Well, a warning is a warning. If you feel annoyed, turn it off.
It's not that C++ prescribes where and how a compiler should warn. I
just think that this could be a worthwhile service.

> >ii) When taking differences containing unsigned types
> >resulting in unsigned's, expecting results as if we had (Z,+).
>
> What do you think is more natural as the result of
> taking the lenght of an empty string and subtracting one ?

Well, it is not "natural" insofar as you're treating this subtraction
as if it happened in the semigroup (N,+). Once you grasp that
subtraction is *not* well defined there, you see that you're asking
for trouble with subtraction on non-negative numbers. That's why I say
that a compiler warning here could be helpful.

> >Depends of course on the programmer as well. I feel
> >perfectly happy with unsigned's, and I'm aware of the
> >limitations.
>
> I've been hit a couple of times only, but it hurts.
> And I've found myself explaining to other programmers
> why their code was wrong in quite a few occasions and,
> believe me, the C++ position is hardly defendible.
> Documenting a bug doesn't change it to a feature; if
> most programmers get this wrong then IMO it's the
> language that is broken.

*Shrug.* Once you open this issue, there are quite a number of points
where C++ is IMHO broken. /-: But it's too late to correct them;
besides, backwards compatibility with C was, in fact, a necessity and
a design decision.

> Like I said before this of course assuming that
> you're writing programs for a reason, and that the
> working program is what you're seeking.

We're entering very vague terrain here, and I don't feel very
comfortable discussing it. *I* feel mostly happy with C++ as we have
it, but maybe I'm just too used to it. If we were to discuss whether
C++ is a very pedagogical language (as in "it's easy to teach"), then
I possibly wouldn't agree. There are more issues "broken" in C++ than
the unsigneds. (-;

Greetings,
Thomas

Le Chaud Lapin

ulæst,
4. maj 2003, 06.01.2704.05.2003
til
vladimir....@ngrain.com (Vladimir Kouznetsov) wrote in message news:<87f8c72.03050...@posting.google.com>...

> Now imagine that function name is seek, int size is 32 bit and there are
> files that can be up to 4G and you'll get what we have now: to seek to the
> _other_ half we have to either seek() twice or seek() from end backward or
> from start forward depending on the distance. What's the point to seek
> backward from the beginning by the way? Don't forget to check that!

Yep. Another anecdote:

A while back, about 20 years I'd say, there was an advertisement
issued by a company that created products in the data storage market.
They proudly announced a new feature that their competitors could not
touch: "DOUBLE YOUR ADDRESSABLE DATA STORAGE CAPACITY FROM 2GB TO
4GB!!!" The cost of the software upgrade was several thousand
dollars. Any programmer seeing this ad knew what happened: someone
used 'signed long int' when 'unsigned long int' would have been
conceptually appropriate, and someone else devised a really clever
technique to undo what the other person had done while making the new
system backward compatible.

But since many of us were just being born then, let's turn our
attention instead to a more contemporary, real-life example
illustrating poor choice of functional form and parameter type: the
notorious SetFilePointer() from Microsoft.

The problem was simple enough. There was this thing called a file.
It had zero or more bytes in it. We needed to set its read/write
index to a particular byte.

What Microsoft could have done was define four simple functions:

bool set_index_absolute (unsigned long int handle,
                         unsigned long int index);
bool set_index_backward (unsigned long int handle,
                         unsigned long int offset);
bool set_index_forward  (unsigned long int handle,
                         unsigned long int offset);
bool get_index          (unsigned long int handle,
                         unsigned long int &index);

(or their conceptual equivalents)

This is very clear. There is very little mystery, and the mystery
that exists is manageable. You are content with the notion that 4GB
is available. The problem of reachability beyond 4GB is a separate
issue and would still have to be addressed by any type scheme (no pun
intended). But Microsoft, like the Unix people, could not resist the
temptation to go into hyper-clever mode: They would combine three
functions into one, using signed indices to control seek behavior.
The result is weird enough in Unix, but check out how bad it gets when
Microsoft takes a stab at it:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/base/setfilepointer.asp

Here's an excerpt for those of you would rather not click on the link:

"To move the file pointer from zero to 2 gigabytes,
lpDistanceToMoveHigh must be set to either NULL or a sign extension of
lDistanceToMove. To move the pointer more than 2 gigabytes, use
lpDistanceToMoveHigh and lDistanceToMove as a single 64-bit quantity.
For example, to move in the range from 2 gigabytes to 4 gigabytes set
the contents of lpDistanceToMoveHigh to zero, or to 1 for a negative
sign extension of lDistanceToMove."

What an abomination. All you wanted to do was set the file pointer,
and instead you find yourself spending at least 15 minutes reading and
rereading the horrific man page to make sure that you have covered all
scenarios. Would you honestly feel confident building complex systems
with functions like these serving as your primitives?
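
By way of contrast, here is a rough sketch (mine, not Microsoft's, and
untested) of how the two-half interface typically ends up being
wrapped so that callers can seek with a single 64-bit offset:

#include <windows.h>

// Hypothetical wrapper: seek to an absolute 64-bit offset by splitting
// it into the low/high halves that SetFilePointer expects.
bool set_index_absolute(HANDLE file, unsigned long long index)
{
    LONG high = static_cast<LONG>(index >> 32);          // upper half
    LONG low  = static_cast<LONG>(index & 0xFFFFFFFFu);  // lower half

    DWORD result = SetFilePointer(file, low, &high, FILE_BEGIN);

    // INVALID_SET_FILE_POINTER is also a legal low half of a position,
    // so GetLastError() must be consulted as well -- part of what makes
    // the interface so easy to misuse.
    return !(result == INVALID_SET_FILE_POINTER &&
             GetLastError() != NO_ERROR);
}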

-Chaud Lapin-

Thomas Richter

ulæst,
4. maj 2003, 06.02.0404.05.2003
til
Hi,


> > Now, that is true for "signed int" as well. If I define
> > variable "int x" the largest representable integer and add
> > one, I get a negative number.
>
> Actually you get undefined behaviour. It might be a negative number on
> your current platform, but the next compiler upgrade might turn it into an
> integer overflow exception. And I hope it does - your program has a bug in
> it and the exception could make it easier to find. Some CPUs can detect
> the overflow in hardware, so it is effectively free.

Thanks for the correction, you're right of course.

Greetings,
Thomas

Thomas Mang

ulæst,
4. maj 2003, 06.05.3204.05.2003
til

Andrea Griffini schrieb:

> On 2 May 2003 12:07:19 -0400, Thomas Richter
> <th...@cleopatra.math.tu-berlin.de> wrote:
>
> >Now, that is true for "signed int" as well. If I define
> >variable "int x" the largest representable integer and add
> >one, I get a negative number.
>
> No. You formally get undefined behaviour. Ok... probably
> 95% of environments out there just give you a negative
> number; but that's not guaranteed.
>
> >This causes a number of caveats for sure, but I still think
> >functions like "strlen" should return something unsigned
> >because it is very natural.
>
> It's not natural because unsigned are not integers...

I don't understand that comment. Or did you mean "signed ints [in the
C++ sense] are not unsigned ints [in the C++ sense]"?

>
> May be you find natural that computing
>
> double x = strlen(s) - 1;
>
> gives you a huge positive number with an empty string ?
> Do you find natural that if I do
>
> for (int i=0; i<strlen(s)-1; i++)
> ...
>
> then the loop is going to take forever and access outside
> the string if the string is empty ?

The problem here is what to consider "natural".

I totally agree with Thomas Richter (and others) that the type of objects
which can't be negative (sizes for example) should be unsigned. Not only
because one more bit gives us a larger range, but mostly because of
coding expressiveness.
I am glad the STL uses unsigned types for sizes.

My experience is that many people use plain int because they want to
avoid certain checks - and in practice they do avoid the checks,
although there is no strong argument that the check can really be
skipped.

As someone else pointed out precisely, overflow/underflow may happen with
both unsigned and signed types, but the values "usually" used are much
closer to the lower bound of unsigned int [that is, 0] than to either
bound of signed ints.
I have very rarely seen checks for overflow/underflow when using signed
ints. But actually, what assumption is behind that? I'd say, usually
ignorance.
To go to your example, what assumption is there that strlen(x) - 1 does
not underflow? Indeed, the code is not underflow safe. A check would be
needed:

if (strlen(x) > std::numeric_limits<int>::min())
int y = strlen(x) - 1;
else
....whatever

And a check would be needed if we were using an unsigned type, but
with the advantage that an unsigned type gives us a more sensible clue
as to what is represented (negative sizes are hardly possible) and
another bit of representable values.

All too often, people simply ignore checks for overflow/underflow when
using signed integers. I don't know exactly what the reasoning behind
this is - it may be "I am using signed types, so I can skip the
check", or the other way around, "I want to skip the check, so I use
signed ints" - but IMO this does not make sense. Signed ints have (no
surprise) bounds too. They also have neither more nor fewer bounds
than unsigned ints (again, no surprise). So what GENERAL advantage do
signed ints have over unsigned ints?
Subtracting one from a signed int has the same chance of underflow as
subtracting one from an unsigned int. That's a fact. Now one can argue
that the values "usually used" will cause underflows in unsigned types
much more frequently. Probably true. I have to deal much more with
numbers in a range of, say, 0 - 100 than with numbers in a range of,
say, -2 billion and some thousands. However, this is a
problem-dependent PROBABILITY, not (in most cases) a guarantee. And
unless there is a guarantee, checks are necessary.
But in my experience, I have seen these checks much less frequently
when dealing with signed ints than when dealing with unsigned ints,
although NO guarantee against possible underflow/overflow was
available. IMO all too often this is ignored and programmers treat a
high probability as a 100% fact - which it isn't.
And overflow/underflow checks can be much more complicated for signed
ints than the corresponding checks in unsigned arithmetic.
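
To illustrate that last point with a sketch of my own (not from the
original post): checking whether a + b would overflow takes a single
comparison in unsigned arithmetic, but two range tests, performed
*before* the addition, in signed arithmetic, since signed overflow is
undefined behaviour.

#include <limits>

// Unsigned: the mathematical sum does not fit exactly when a exceeds
// the distance from b to the maximum; one comparison suffices.
bool sum_would_wrap(unsigned a, unsigned b)
{
    return a > std::numeric_limits<unsigned>::max() - b;
}

// Signed: both bounds matter, and the test must precede the addition.
bool sum_would_overflow(int a, int b)
{
    if (b > 0) return a > std::numeric_limits<int>::max() - b;
    if (b < 0) return a < std::numeric_limits<int>::min() - b;
    return false;
}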

regards,

Thomas

John Potter

ulæst,
5. maj 2003, 06.38.5905.05.2003
til
On 4 May 2003 06:05:32 -0400, Thomas Mang <a980...@unet.univie.ac.at>
wrote:

> I have very rarely seen checks for overflow/underflow when using signed
> ints.

A simple technical correction. Overflow refers to getting too large in
magnitude to be represented. It happens in both directions for signed
ints, unsigned ints and floating types. Underflow refers to getting
too small in magnitude to be represented. It happens in floating types
only. A small fraction becomes zero.

John

Dave Harris

ulæst,
5. maj 2003, 06.39.3305.05.2003
til
vladimir....@ngrain.com (Vladimir Kouznetsov) wrote (abridged):

> Now imagine that function name is seek, int size is 32 bit and there are
> files that can be up to 4G and you'll get what we have now: to seek to
> the _other_ half we have to either seek() twice or seek() from
> end backward or from start forward depending on the distance.

There are some cases where getting an extra bit is really valuable. This
is one of them.

However, these are really the exception. To use unsigned here is arguably
exactly the kind of bad engineering which the original poster complains
about - letting a special case leak into the main code. A better solution
might be to encapsulate file offsets within a new class, which can use a
private representation.


> So what are those overwhelming advantages of signed? Just don't tell
> me about mixing types in expressions - I hate implicit conversions.

No fair! The rules for mixed signed/unsigned expressions are one of
the major reasons for avoiding unsigned.

Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
bran...@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

Dave Harris

ulæst,
5. maj 2003, 06.40.2205.05.2003
til
unorigina...@yahoo.com (Le Chaud Lapin) wrote (abridged):

> the notorious SetFilePointer() from Microsoft.
>
> [API which passes 64-bits in two 32-bit halves]

>
> What an abomination. All you wanted to do is set the file pointer,
> and instead you find yourself spending at least 15 minutes reading and
> rereading the horrific man page to make sure that you are covering all
> scenarios. Would you honestly feel confident building complex systems
> with functions like these serving as your primitives?

It's a low-level, C interface, designed to support 64-bit files in a
32-bit world. In practice it should be wrapped by a higher level
interface, either using a 64-bit "long long" or, more likely, using the
exact representation encapsulated by a class.

The documentation can be summarised by saying it is a 64-bit int passed in
two halves. The rest of it is just explaining what that means.


> [alternative API]


> The problem of reachability beyond 4GB is a separate issue and
> would still have to be addressed by any type scheme (no pun
> intended).

It's not a separate issue. It seems to me that API is specifically
intended to support 64-bit files. Using unsigned long is an ugly bodge
which does not solve the real problem. You would end up with two APIs for
the same job. If you're going to do that, better is:

bool set_index_absolute( HANDLE hFile, long index );
bool set_index_absolute( HANDLE hFile, long long index );

Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
bran...@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

Thomas Mang

ulæst,
5. maj 2003, 06.47.4905.05.2003
til

Thomas Mang schrieb:

> To go to your example, what assumption is there that strlen(x) - 1 does
> not underflow? Indeed, the code is not underflow safe. A check would be
> needed:
>
> if (strlen(x) > std::numeric_limits<int>::min())
> int y = strlen(x) - 1;
> else
> ....whatever

Sorry for possible confusion. This example assumed what would have to
be done were std::strlen to return a signed int.
Also, it would be better to replace strlen in the example with some
lesser-known function returning an int, one that is not explicitly
specified by the Standard.

Andrea Griffini

ulæst,
5. maj 2003, 06.52.2605.05.2003
til
On 4 May 2003 06:05:32 -0400, Thomas Mang <a980...@unet.univie.ac.at>
wrote:

> > >This causes a number of caveats for sure, but I still think


> > >functions like "strlen" should return something unsigned
> > >because it is very natural.
> >
> > It's not natural because unsigned are not integers...
>
>I don't understand that comment. Or did you mean "signed int
>[inC++-sense] are not unsigned int[in C++ - sense]"?

With "integers" we usually means members of Z; the C/C++
int type tries to mimic that but of course has limitations;
and when it's not able to follow you too far then you're
thrown to the undefined behaviour area.
C++ "unsigned" types however don't try to model integers
as member of N+{0}, but they're designed as elements of the
finite ring Z_{2^n} (matematicians named Z_{m} as the
ring made of elements 0,1,...,m-1 where the additive and
multiplicative operations are the usual operations followed
by a "mod m" reduction to get a number between 0..m-1).

Sure one can map the integers 0..2^n-1 to the elements of
that ring, but the conversion rules (sometimes) work clearly
against this view.
For example if z is in Z, and n is in N then z-n, z+n,
z*n are elements of Z, but C++ rules are different; and
for ints and unsigned ints of the same bit size you get
int - unsigned, int + unsigned and int * unsigned all
being unsigned.

Does this seem natural to you? To me it doesn't at all.
Nor does it seem natural to everyone who slips and
writes code like "i<v.size()-1"; and even if you may not
believe me, such errors are made by many of those who
approach C++.

If I write x*v.size() where x is negative, does it make
sense that I get a strange huge number? I really can't
believe in the bona fides of anyone who says that this
is natural.

Sure, we have a problem in C++: implicit conversions
are sometimes dangerous, and here the implicit conversions
from ints to unsigned (a strange type that behaves
quite differently from integers) are what bites.
But those conversions IMO are here to stay.

I'm not saying that unsigned types are useless...
actually they have a place, and they're quite useful
in their place. I just think that with the existing
language rules the unsigned types are often misused;
they are for example used where an integer that can't
be negative is needed... but IMO the int types are nearer
to such a concept than the C++ unsigned types (i.e. the
members of Z_{2^n}) are.

Note that the "strange" behaviour of unsigned isn't
confined to the "boundary" of the domain... in the case
of x*v.size() you get an illogical result even with
x==-2 and v.size()==2.

> > May be you find natural that computing
> >
> > double x = strlen(s) - 1;
> >
> > gives you a huge positive number with an empty string ?
> > Do you find natural that if I do
> >
> > for (int i=0; i<strlen(s)-1; i++)
> > ...
> >
> > then the loop is going to take forever and access outside
> > the string if the string is empty ?
>
>The problem here is what to consider "natural".

Do you think that the most natural output of the program

#include <iostream>
#include <vector>

int main()
{
    int x = -2;
    std::vector<int> v(2);
    std::cout << x*v.size() << "\n";
    return 0;
}

is 4294967292 (what I get with one of the
compilers I use) ?
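
(Worked out under the assumption of 32-bit unsigned arithmetic: -2
converts to 2^32 - 2 = 4294967294, and multiplying that by 2 modulo
2^32 gives 4294967292.)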

I'm not saying that those results are technically
wrong; but I really have no words for discussing
with people who would classify that behaviour as
"natural".

>I totally agree with Thomas Richter (and others)
>that the type of objects which can't be negative
>(sizes for example) should be unsigned.

Too bad there is no unsigned type in C++, and
even worse that such a cool name "unsigned" was
stolen for something else.
It's also a true misfortune that this other thing
has a bad effect when mixed with ints (probably
the most common native type used) due to the
strange language rules.

>Not only because one more bit gives us a larger
>range, but mostly because of coding expressiveness.

They have an extended range, but just that single
bit doesn't seem to me enough to dive into the
related complications.

>I am glad the STL uses unsigned types for sizes.

I'm not. And I'm glad you're not one of my colleagues,
because you would probably design your classes making
the same errors.

>My experience is many people use plain int because they want to avoid
>certain checks - and in practice avoid the checks, although there is no
>strong argument that the check may be really skipped.

I think I lost you on this. How would the use of signed
int allow one to avoid checks?

>As someone else pointed out precisely, overflow/underflow may happen with
>both unsigned and signed types, but the values "usually" used are much
>closer to the lower bound of unsigned int [that is, 0] than to either
>bound of signed ints.

0 is not a lower bound for unsigned ints; it's
perfectly safe to go past that limit, and this
is a very nice property of unsigned types.
C++ unsigned int types are indeed a far more
faithful model than the signed integer types;
however, what they model is a strange finite ring.

>I have very rarely seen checks for overflow/underflow when using signed
>ints. But actually, what assumption is behind that? I'd say, usually
>ignorance.

Or maybe you're confident that for the values
being used no overflow can occur.

>So what GENERAL advantage have signed ints over unsigned ints?
>Subtracting one from a signed int has equal chances for underflow than
>subtracting one from an unsigned int. That's a fact.

No. Not in real programs. This is at least my experience.

>Now one can argue that values "usually used" will cause underflows
>in unsigned types much more frequently. Probably true.
>I have to deal much more with numbers in a range of, say 0 - 100
>than with numbers in range of say -2billions and some thousands.
>However, this is a problem dependent PROBABILITY,

I think I misunderstood your previous sentence then; I took
"chances" to be related to probability.

>but (in most cases) no guarantee. And unless there is a guarantee,
>checks are necessary.

I do not understand what your point is. I simply said that
C++ "unsigned" logic is unnatural, and that it creates
problems when C++ unsigned types are used to represent
integers that can't be negative.
You say that "unsigned" documents intent better, but
IMO that's just an unfortunate coincidence, because
"unsigned" in C++ doesn't mean an element of N+{0};
it means an element of Z_{2^N} plus a set of strange
implicit conversion rules.

>But in my experience, I have seen these checks much less frequent when
>dealing with signed ints than when dealing with unsigned ints, although
>NO guarantee for possible underflow/overflow was available.

C++ unsigned ints can't overflow or underflow (that is
a nice property of the modulo arithmetic). I really can't
follow what you're saying.

>IMO all too often this is ignored and programmers treat a high
>probability as a 100% fact - what it isn't.

I don't need *all* uses of unsigned types to create
a problem in order to conclude that unsigned types are
a problem. I have just seen how often they create a
problem, and how much good they can bring.
My conclusion is that using unsigned types for all integers
that can't be negative is an error. That approach would
IMO just lead to worse programs.

Andrea

Francis Glassborow

ulæst,
5. maj 2003, 16.13.0805.05.2003
til
In message <dvrabv86vrqq8eidr...@4ax.com>, Andrea Griffini
<agr...@tin.it> writes

>With "integers" we usually means members of Z; the C/C++
>int type tries to mimic that but of course has limitations;
>and when it's not able to follow you too far then you're
>thrown to the undefined behaviour area.
>C++ "unsigned" types however don't try to model integers
>as member of N+{0}, but they're designed as elements of the
>finite ring Z_{2^n} (matematicians named Z_{m} as the
>ring made of elements 0,1,...,m-1 where the additive and
>multiplicative operations are the usual operations followed
>by a "mod m" reduction to get a number between 0..m-1).

Thanks Andrea. I think this is the nub of the problem. The signed and
unsigned types in C++ are actually very different things which makes it
particularly unfortunate that we have these complicated implicit
conversion rules between them. In an ideal world they would not be
implicitly inter-convertible.

Now I can live with the co-existence of signed and unsigned types as
long as nothing 'forces' me to remember about their differences and the
dangerous inter-conversions.

Left to my own devices I would always use signed for arithmetic (and
strongly argue for out of range results to throw an exception). I would
confine my use of unsigned to bit twiddling. Unfortunately I am not
given that freedom because of all the places in the Standard C++ Library
that use unsigned return values.

The only recourse I have is to strongly urge compiler implementers to
provide diagnostics for all mixed signed/unsigned expressions. I would
really like that to be a switch all by itself quite distinct from
warning levels because mixed signed/unsigned expressions are always
suspect.


--
ACCU Spring Conference 2003 April 2-5
The Conference you should not have missed
ACCU Spring Conference 2004 Late April
Francis Glassborow ACCU

Le Chaud Lapin

ulæst,
5. maj 2003, 17.03.1205.05.2003
til
deane...@hotmail.com (Gavin Deane) wrote in message news:<6d8002d0.03050...@posting.google.com>...

> All engineers will make mistakes sometimes. What is wrong with


> looking for mistakes where they are easiest to detect, even if they
> need to be fixed elsewhere?

A question for the pro-signed people:

How would you write strncpy()? To my surprise, after looking up the
spec, I discovered that it has been modified to use size_t rather than
int, but let's forget about that and pretend that you could write it
yourselves. Naturally, you would use 'int' for 'n'. What I am
particularly interested in is how your implementation would handle the
following piece of code:

char s1[16] = "Hello";
char s2[16] = "World";

strncpy (s1, s2, -20);

What does your strncpy do?

-Chaud Lapin-

I V

ulæst,
6. maj 2003, 05.47.3506.05.2003
til
On Mon, 05 May 2003 17:03:12 -0400, unorigina...@yahoo.com (Le
Chaud Lapin) wrote:
> A question for the pro-signed people:
>
> How would you write strncpy()? To my surprise, after looking up the spec,
> I discovered that it has been modified to use size_t from int, but lets
> forget about that and pretend that you could do it yourselves. Naturally,
> you would use 'int' for 'n'. What I am particulary interested in is how
> your implementation would handle the following piece of code:
>
> char s1[16] = "Hello";
> char s2[16] = "World";
>
> strncpy (s1, s2, -20);
>
> What does you strncpy do?

Abort, probably. I'd start the function with:

#include <cassert>  // for assert
#include <cstddef>  // for NULL

void strncpy(char* dest, const char* src, int n)
{
    assert(dest != NULL);
    assert(src != NULL);
    assert(n >= 0);

    ...
}

An implementation using unsigned for n would silently do
the wrong thing. In what circumstances is that better?

What I'd ideally do, BTW, is make the function

void strncpy(char* dest, char* src, natural<int> n);

where natural is a class template that implements proper semantics for
non-negative numbers. In that case, strncpy(s1, s2, -20) wouldn't compile.
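
For illustration only (my sketch, not I V's actual template), a
minimal runtime-checked version of such a wrapper might look like the
following; rejecting a literal like -20 at compile time, as described
above, would take additional machinery.

#include <cassert>

// Hypothetical wrapper: holds a value that is checked to be
// non-negative on construction.
template <typename T>
class natural
{
public:
    natural(T value) : value_(value) { assert(value_ >= 0); }
    T get() const { return value_; }
private:
    T value_;
};

void strncpy_checked(char* dest, const char* src, natural<int> n)
{
    // copy at most n.get() characters from src to dest ...
    (void)dest; (void)src; (void)n;  // body elided in this sketch
}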

--
"Not bad meanin' bad, but bad meanin' good,
Damn I'm so hood"
http://ivlenin.web-page.net/
